Foreword


This  report  compares  the  current  enlisted  job  classification  algorithm,  Classification 
and  Assignment  within  PRIDE  (CLASP)  instituted  in  1981,  with  a  proposed  replacement 
algorithm,  the  Rating  Identification  Engine  (RIDE).  RIDE  was  developed  over  the 
course  of  several  years,  beginning  with  funding  from  the  Office  of  Naval  Research  (Code 
34,  PE  0603236N),  augmented  by  funding  from  Commander  Navy  Recruiting 
Command  to  accelerate  its  development.  The  motivation  to  build  a  replacement  for 
CLASP  was  two-fold.  First,  components  of  CLASP  are  not  well  documented  and  it 
executes  off  an  expensive  mainframe  computer  system.  Second,  CLASP  has  a  number  of 
“hard  coded”  components  that  are  inflexible  and  difficult  to  maintain.  In  contrast,  RIDE 
is  web-based  and  flexible.  The  flexibility  to  add  new  classification  rules,  filters,  and  tests 
was  seen  as  an  important  component  of  our  research  program  to  overhaul  and  improve 
the  Navy’s  enlisted  selection  and  classification  process. 

RIDE substantially met the design requirements: it has an easy-to-use interface, can
be reconfigured rapidly and easily, and, most importantly, allows new tests or classification
tools to be integrated easily. However, RIDE was under an accelerated development
cycle  to  meet  deadlines  to  coincide  with  a  planned  overhaul  of  the  Navy’s  recruiting 
management  system  (of  which  CLASP  was  one  component).  As  a  result,  RIDE  was  not 
as  thoroughly  evaluated  against  CLASP  as  would  otherwise  have  been  done.  The  current 
report  provides  a  detailed  evaluation  of  both  CLASP  and  RIDE  and  compares  them  in 
terms  of  their  embedded  philosophies,  functionality,  maintenance,  and  efficacy.  In  the 
end,  it  is  clear  that  the  continued  use  of  CLASP  is  indefensible  for  a  number  of  reasons. 
Nevertheless,  there  are  several  concerns  with  RIDE  that  should  be  remedied  and  a  plan 
is  needed  to  refresh  its  parameters  to  maintain  its  integrity  across  time. 

The specific work reported here was supported by the Navy Personnel Research,
Studies,  and  Technology  department  (Ms.  Janet  Held)  through  the  U.S.  Research  Office 
of  Scientific  Services  Program  administered  by  Battelle  (Delivery  Order  296,  Contract 
No.  DAAD19-02-D-0001). 


David  L.  Alderton,  Ph.D. 

Director 
Introduction


Across  the  United  States,  Military  Entrance  Processing  Stations  (MEPS)  process 
approximately  51,000  applicants  for  enlistment  into  the  active  U.S.  Navy  each  year.  Of 
these,  upwards  of  39,000  are  selected  for  enlistment  (Kaemmerer,  G.,  personal 
communication,  September  14,  2006).  Each  of  the  selected  applicants  must  be  classified 
into  a  Navy  rating  (i.e.,  job)  that  matches  the  individual's  abilities  and  is  needed  by  the 
Service. To accomplish this, a job-matching algorithm is employed. The current Fortran-based
algorithm was originally developed in the late 1970s and put into wide-scale Navy
use  in  1981.  This  paper  compares  and  contrasts  the  existing  classification  software  with 
a  newly  designed  classification  algorithm. 

Rating Identification Engine (RIDE)

The  Personalized  Recruiting  for  Immediate  and  Delayed  Entry  (PRIDE)  system  is 
the  Navy's  current  overarching  computer  system  for  processing  applicants  for 
enlistment  into  the  Navy.  The  Rating  Identification  Engine  (RIDE)  is  an  enlisted  Navy 
rating  job  classification  algorithm  that  is  designed  to  replace  the  Classification  and 
Assignment  within  PRIDE  (CLASP)  algorithm.  RIDE  consists  of  two  components:  (1) 
the  School  Pipeline  Success  Utility  (SPSU)  and  (2)  the  Armed  Forces  Qualification  Test 
(AFQT).  The  two  RIDE  components  are  designed  to  work  in  close  association  with  each 
other  as  opposed  to  the  more  or  less  independent  operation  of  the  six  CLASP 
components. 

Classification and Assignment within PRIDE (CLASP)

Much  of  the  material  in  this  report  is  quoted  directly  from  Kroeker  and  Rafacz 
(1983),  which  describes  the  five  components  of  the  original  CLASP  model  implemented 
in  1981.  Kroeker  and  Folchi  (1984)  describe  the  Attrition  Component,  which  was  added 
to  CLASP  in  1983. 

The  CLASP  utility  model  was  formulated  to  ensure  consistent  application  of  Navy 
personnel  classification  policy  among  classifiers  and  from  one  assignment  to  the  next.  It 
comprises six components: School Success, Aptitude/Complexity, Navy
Priority/Individual  Preference,  Minority  Fill,  Fraction  Fill,  and  Attrition.  Each 
component  was  designed  to  influence  a  composite  utility  calculation  independently  of 
the  others.  This  design  does  not  imply  strict  statistical  independence;  rather,  a  slight 
degree  of  correlation  among  the  utility  components  is  expected.  The  magnitude  of  these 
correlations  has  never  been  studied. 

The School Success, Aptitude/Complexity, Navy Priority/Individual Preference, and
Attrition components are often called "Fit" components, because they optimize job assignments
based  upon  psychologically-based  goodness-of-fit  measures.  The  Aptitude/Difficulty, 
Priority/Preference,  and  Attrition  components  are  very  similar  because  they  are  based 
on  policymaker  judgments  concerning  the  value  to  the  Navy  of  assigning  an  individual 
with  a  given  person  attribute  to  a  job  with  a  given  job  attribute.  The  school  success 
component differs from the other "Fit" components because its utility model is based
upon  the  empirical  relationship  between  “A”  School  performance  and  Armed  Services 
Vocational  Aptitude  Battery  (ASVAB)  composite  scores. 

The  Minority  Fill  and  Fraction  Fill  components  are  often  referred  to  as  "Fill" 
components,  because  they  optimize  based  upon  the  goal  of  achieving  approximately 
equal  fill  rates  during  each  recruiting  period.  The  Minority  Fill  component  focuses  upon 
achieving  appropriate  balance  between  minority  and  non-minority  accessions  in  each 
job  category,  while  the  Fraction  Fill  component  is  focused  on  achieving  uniform  quota 
fill  rates  across  job  categories. 

Using  a  utility  function  whose  mathematical  form  is  unique  to  it,  each  CLASP 
component  computes  the  raw  utility  value  of  the  prospective  person  to  job  assignment. 
Then,  using  mean  and  standard  deviation  parameters  that  describe  the  distribution  of 
utility  values  in  the  reference  population,  each  raw  utility  is  standardized  so  that  its 
mean  is  50  and  its  standard  deviation  is  10. 
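
The standardization step can be illustrated with a minimal sketch. It assumes the usual linear rescaling of a raw utility against the reference-population mean and standard deviation; the function name and the sample numbers are hypothetical and are not taken from CLASP.

```python
def standardize_utility(raw, ref_mean, ref_sd):
    """Rescale a raw utility so the reference population has mean 50 and SD 10.

    ref_mean and ref_sd describe the distribution of raw utilities in the
    reference population (assumed to be supplied by the caller).
    """
    if ref_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return 50.0 + 10.0 * (raw - ref_mean) / ref_sd


# Hypothetical component with reference mean 20 and SD 5.
print(standardize_utility(20, 20, 5))  # a raw utility at the reference mean
print(standardize_utility(30, 20, 5))  # a raw utility two SDs above the mean
```

Placing every component on the same 50/10 scale is what lets the raw utilities, each computed with a different functional form, be compared with one another.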

Both  the  RIDE  and  CLASP  algorithms  can  be  conceptualized  as  operating  on  a  payoff 
matrix,  which  is  a  rectangular  array  of  numbers  representing  the  utilities  of  the  various 
decision outcome combinations. Assume that there are m individuals to be assigned to
jobs and n job openings. If individuals are indexed by i ($1 \le i \le m$) and jobs are indexed
by j ($1 \le j \le n$), then the entry $U_{ij}$ in row i and column j of the matrix expresses the value
to the Navy (on an arbitrary scale) of assigning the ith person to the jth job. Higher payoff
values are more desirable than lower ones, because Navy policy considers the
probability of success on a job to be a monotonically increasing function of payoff value.
The payoff matrix may be used for both comparisons across jobs and comparisons
across individuals. Thus, $U_{ij_1} > U_{ij_2}$ implies that individual i is better suited for job $j_1$
than job $j_2$, while $U_{i_1 j} > U_{i_2 j}$ implies that individual $i_1$ is better suited for job j than
individual $i_2$.
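
The two kinds of comparison can be sketched on a small payoff matrix. The utilities below are invented for illustration, the helper names are hypothetical, and indices are 0-based here whereas the text counts from 1.

```python
# Hypothetical payoff matrix: rows are individuals, columns are jobs.
U = [
    [52.0, 61.0, 47.0],  # individual 0
    [58.0, 49.0, 55.0],  # individual 1
]


def best_job_for(payoff, i):
    """Comparison across jobs: column with the highest utility in row i."""
    row = payoff[i]
    return max(range(len(row)), key=lambda j: row[j])


def best_person_for(payoff, j):
    """Comparison across individuals: row with the highest utility in column j."""
    return max(range(len(payoff)), key=lambda i: payoff[i][j])


print(best_job_for(U, 0))     # job index best suited to individual 0
print(best_person_for(U, 0))  # individual index best suited to job 0
```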

Ideally,  the  composite  utility  function  for  each  job  category  should  be  a  realistic 
mathematical  representation  of  the  value  of  assigning  a  given  person  to  that  job,  based 
upon  all  identifiable  factors  considered  relevant  to  the  classification  decision.  However, 
because  there  are  several  important  factors  that  neither  RIDE  nor  CLASP  are  able  to 
incorporate  into  the  classification  process,  the  goodness-of-fit  measures  they  generate 
are  often  only  a  small  part  of  the  information  factored  into  the  final  classification 
decision. 
School Pipeline Success Utility (SPSU) Component of RIDE and School Success Component of CLASP Comparison


This  section  evaluates  the  empirical  relationship  between  composite  score  and  “A” 
School  performance,  and  the  manner  in  which  that  relationship  is  incorporated  into 
CLASP  and  RIDE,  with  emphasis  on  RIDE.  In  particular,  we  want  to  know  how  well  the 
applicant's  ASVAB  composite  score  predicts  “A”  School  performance.  We  also  evaluate 
the  “Point  of  Diminishing  Return(s)”  (PDR)  concept. 

The  PDR  concept  hypothesizes  the  following  general  relationship  between  composite 
score  and  school  performance:  The  relationship  is  monotonically  increasing  at  the  lower 
end  of  the  composite  score  distribution,  including  the  region  to  the  immediate  right  of 
the  cut  score.  However,  as  the  composite  score  increases  toward  the  PDR,  the  rate  of 
performance  improvement  declines  and  eventually  flattens  out  at  the  PDR.  Between  the 
PDR  and  the  high  end  of  the  score  distribution,  school  performance  either  remains  flat 
or  declines.  The  leveling  off  or  decline  may  be  attributed  to  high  aptitude  students  who 
are  "over-qualified"  for  the  curriculum/career  path  they  are  being  considered  for  and, 
thus,  may  be  better  suited  for  a  more  challenging  training  curriculum  and/or  career 
path. 

The following procedure (Folchi, 1999) was used to model the empirical relationship
between composite score and First Pass Pipeline Success (FPPS) in each of 70 “A”
School samples and determine the PDR in each sample. The data consisted of students
enrolled in the “A” School training pipelines for 70 ratings during fiscal years 1996,
1997, and 1998. The primary ASVAB selector composite score and FPPS status were
available for each student in each sample. The dichotomous criterion FPPS is coded as 1
(one, success) if the student completed all courses in his “A” School pipeline without any
course failures or setbacks, and as 0 (zero, failure) otherwise. The procedure defines a
methodology for grouping adjacent data points into groups (hereafter called "bins") that
are (somewhat) evenly spaced along the composite score distribution.

Starting at the high end of the distribution, the procedure sequentially constructs
bins by moving toward the low end in bin range increments of 5 points. The procedure
adds all points in each increment to the bin, and continues on to the next increment,
until the bin contains at least 10 points (the minimum bin size). After the bin
membership has been determined in this manner, the bin is identified with a value on
the composite score scale equal to the midpoint of the maximum and minimum of
scores in all increments used to build the bin. Construction of the next bin (to the left of
the bin just completed) starts at the point immediately to the left of the minimum score
in the previous bin. The FPPS rate among the students in each bin, paired with the bin's
composite score value, gives one point on the curve of conditional probability of FPPS
against composite score. The PDR is found by determining all bins whose FPPS rates are
within 1 percent of the rate in the bin having the largest FPPS rate. The PDR is the
lowest composite score associated with the bins from this set. The bin in which the PDR
is located is called the PDR bin and the bin in which the Cut Score is located is called
the Cut Score bin. The associated points are named accordingly: (CS, $F_{CS}$) is the Cut
Score point and (PDR, $F_{PDR}$) is the PDR point.
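
A minimal sketch of the bin construction and PDR search follows. It is one reading of the procedure, not the Folchi (1999) implementation, and it rests on several assumptions that the text leaves open: composite scores are integers, the bin midpoint is computed from the highest and lowest observed scores in the bin, a leftover group at the low end with fewer than 10 points is merged into its neighbor, and "within 1 percent" means an absolute difference of 0.01 in the FPPS rate. The function names are hypothetical.

```python
def build_bins(scores, outcomes, increment=5, min_size=10):
    """Group (score, FPPS) pairs into bins, working from high scores to low.

    Returns a list of (midpoint, fpps_rate, size) tuples, highest bin first.
    """
    data = sorted(zip(scores, outcomes), reverse=True)
    groups = []
    idx = 0
    while idx < len(data):
        upper = data[idx][0]  # next bin starts at the next observed score
        members = []
        while idx < len(data) and len(members) < min_size:
            lower = upper - increment + 1  # increment covers lower..upper
            while idx < len(data) and data[idx][0] >= lower:
                members.append(data[idx])
                idx += 1
            upper = lower - 1
        groups.append(members)

    # Assumption: a short leftover group at the low end joins its neighbor.
    if len(groups) > 1 and len(groups[-1]) < min_size:
        leftover = groups.pop()
        groups[-1].extend(leftover)

    bins = []
    for members in groups:
        member_scores = [s for s, _ in members]
        midpoint = (max(member_scores) + min(member_scores)) / 2
        rate = sum(y for _, y in members) / len(members)
        bins.append((midpoint, rate, len(members)))
    return bins


def find_pdr(bins, tolerance=0.01):
    """Lowest bin midpoint whose FPPS rate is within tolerance of the best rate."""
    best = max(rate for _, rate, _ in bins)
    near_best = [mid for mid, rate, _ in bins if best - rate <= tolerance]
    return min(near_best)
```

Because `find_pdr` always returns the lowest bin in the near-maximum set, it reports a PDR for any data set, including one in which the FPPS rate rises steadily up to the highest bin.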
SPSU Component of RIDE

The  SPSU  component  description  is  based  on  Folchi  (1999).  The  description  has 
been  broken  into  3  stages  to  provide  a  more  detailed  and  understandable  explanation. 


Stage I: Bin FPPS Rate Model


The  Stage  I  model  is  the  result  of  the  bin  construction  algorithm  after  all  adjacent 
bins  have  been  connected  by  line  segments.  Its  equation  is  given  in  Appendix  A.  As 
shown  in  Figure  1,  the  Stage  I  model  is  piece-wise  linear  such  that  each  segment 
provides  a  linear  interpolation  estimator  of  the  conditional  probability  of  FPPS  for 
composite  scores  between  the  midpoints  of  adjacent  bins.  However,  due  to  its 
complexity,  the  Stage  I  model  was  transformed  into  Stage  II. 


[Figure 1 plot: horizontal axis Composite Score; series SPSU 1, SPSU 2, SPSU 3.]

Figure 1. School Pipeline Success Utility

Stage II: Non-standardized FPPS Prediction Model with PDR

The Stage II model is constructed by eliminating all bins and line segments in the
Stage I model, except the Cut Score (CS) and PDR points. The line segment between
these 2 points estimates the conditional probability of FPPS for each composite score in
the interval $CS \le X \le PDR$. For $X < CS$, the SPSU is zero, as defined by the horizontal
line starting at $X = CS - 1$ and extending to the left toward the minimum composite
score. For $X > PDR$, the Stage II model is defined by the horizontal line starting at the PDR
point and extending to the right toward the maximum composite score.
As shown in Figure 1, the Stage II model simplifies the Stage I model because it has
no  more  than  three  line  segments.  It  is  considered  unstandardized  because  it  has  not 
been  adjusted  so  that  meaningful  comparisons  across  job  options  are  possible.  The 
Stage  II  equation  is  given  by: 


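$$
S^{II}_{ij} =
\begin{cases}
0, & X_{ij} < CS_j, \\[4pt]
100\left[\dfrac{F_{PDR_j} - F_{CS_j}}{PDR_j - CS_j}\,\left(X_{ij} - CS_j\right) + F_{CS_j}\right], & CS_j \le X_{ij} < PDR_j \ \text{and}\ CS_j \ne PDR_j, \\[8pt]
100\,F_{PDR_j}, & PDR_j \le X_{ij},
\end{cases}
\qquad \text{(Stage II Equation)}
$$

The first and third cases apply regardless of whether $PDR_j = CS_j$ or $PDR_j \ne CS_j$. Here
$F_{PDR_j}$ and $F_{CS_j}$ are the FPPS rates in the PDR and cut score bins, respectively,

$CS_j$ = Cut Score for composite associated with job option j,

$PDR_j$ = PDR for job option j, and

$X_{ij}$ = ASVAB composite score for individual i in job option j.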
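
The piecewise Stage II rule described in the text (zero below the cut score, a straight line from the Cut Score point to the PDR point, and a flat line at or above the PDR) can be sketched as follows. The function and argument names are hypothetical, and the sample cut score, PDR, and FPPS rates are invented for illustration.

```python
def spsu_stage2(x, cs, pdr, f_cs, f_pdr):
    """Non-standardized Stage II utility on a 0-100 scale.

    x      composite score of the applicant
    cs     cut score for the job option
    pdr    point of diminishing returns for the job option
    f_cs   FPPS rate in the cut score bin (a proportion)
    f_pdr  FPPS rate in the PDR bin (a proportion)
    """
    if x < cs:
        return 0.0
    if x >= pdr:  # also covers the case pdr == cs
        return 100.0 * f_pdr
    slope = (f_pdr - f_cs) / (pdr - cs)
    return 100.0 * (slope * (x - cs) + f_cs)


# Hypothetical job option: CS = 120, PDR = 140, F_CS = 0.6, F_PDR = 0.9.
for score in (119, 120, 130, 140, 150):
    print(score, spsu_stage2(score, 120, 140, 0.6, 0.9))
```

The jump from 0 at a score of 119 to 100 times the cut score bin rate at 120 is the discontinuity at the cut score that the model exhibits whenever that rate is nonzero.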


The  conditional  FPPS  probabilities  provided  by  the  Stage  II  model  cannot  be 
meaningfully  compared  across  different  job  options.  If  applicants  were  assigned  to  jobs 
solely  on  the  basis  of  their  conditional  FPPS  probability,  then  most  would  be  assigned  to 
easy  schools  and  few  would  be  assigned  to  difficult  schools,  since  the  easier  schools 
generally  have  larger  FPPS  probabilities.  (Ease  and  difficulty  in  this  context  refer  to  both 
the  proportion  of  the  applicant  population  satisfying  the  ASVAB  selection  standard  and 
the  proportion  of  student  population  satisfying  the  FPPS  criterion.)  For  example, 
suppose  an  applicant  has  the  same  conditional  probability  of  FPPS  in  schools  A  and  B 
and  is  qualified  for  both  schools.  Assume  also  that  A  uses  a  more  stringent  ASVAB 
selection  criterion  than  B  and  that  A  graduates  a  smaller  proportion  of  students  than  B. 
One  may  argue  that  it  would  be  more  beneficial  to  send  this  applicant  to  A  than  to  B. 
Accordingly,  the  SPSU  Stage  III  model  adjusts  the  Stage  II  conditional  probability  of 
FPPS  estimate  for  two  measures  of  school  difficulty:  (a)  difficulty  experienced  by  the 
average  applicant  population  member  in  satisfying  the  ASVAB  qualification  standard, 
and  (b)  difficulty  experienced  by  the  average  “A”  School  qualified  student  in  satisfying 
the  FPPS  criterion. 

Another  method  of  counteracting  the  tendency  for  school  success  utility  scores  to  put 
too  many  applicants  in  easy  schools  is  to  design  the  remaining  classification  model 
components  to  compensate  for  this  tendency.  For  example,  the  CLASP 
Aptitude/Difficulty  component  counteracts  the  CLASP  School  Success  component  in 
this  respect.  In  RIDE,  both  the  transition  from  Stage  II  to  Stage  III  and  the  RIDE  AFQT 
component  fulfill  the  compensatory  role. 
The Hardness index is a measure of the difficulty that the average applicant
population  member  experiences  in  satisfying  the  “A”  School  ASVAB  qualification 
standard.  It  assumes  values  between  zero  and  one,  where  zero  indicates  the  minimum 
difficulty  and  one  indicates  maximum  difficulty.  The  hardness  index  for  job  option  j  is 
defined  as: 


Let NRJO be the total number of RIDE job options.

Let $NT_j$ be the number of ASVAB subtests in the composite for job option j.

Let $H_{Max} = \max\{\, CS_j / NT_j : 1 \le j \le NRJO \,\}$, and let $H_{Min} = \min\{\, CS_j / NT_j : 1 \le j \le NRJO \,\}$. The hardness factor is given by:

$$
H_j = \frac{\dfrac{CS_j}{NT_j} - H_{Min}}{H_{Max} - H_{Min}}
$$
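
A short sketch of the hardness index follows, assuming it is the min-max rescaling of cut score per subtest across all job options, which is the reading consistent with values running from zero (least difficult) to one (most difficult). The job labels, cut scores, and subtest counts are hypothetical.

```python
def hardness_index(cut_scores, n_subtests):
    """Rescale cut score per subtest to the 0-1 range across job options.

    cut_scores and n_subtests are dicts keyed by job option.
    """
    per_subtest = {j: cut_scores[j] / n_subtests[j] for j in cut_scores}
    h_max = max(per_subtest.values())
    h_min = min(per_subtest.values())
    if h_max == h_min:
        raise ValueError("all job options have the same cut score per subtest")
    return {j: (v - h_min) / (h_max - h_min) for j, v in per_subtest.items()}


# Hypothetical job options A, B, C.
cut_scores = {"A": 100, "B": 165, "C": 240}
n_subtests = {"A": 2, "B": 3, "C": 4}
print(hardness_index(cut_scores, n_subtests))
```

Dividing by the number of subtests puts composites built from different numbers of subtests on a common per-subtest footing before they are compared.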


The  adjustment  for  the  difficulty  experienced  by  the  average  student  in  satisfying  the 
FPPS  criterion  is  determined  by  the  reciprocal  of  the  FPPS  rate  at  the  PDR.  This,  of 
course,  assumes  that  the  FPPS  rate  at  the  PDR  is  representative  of  the  FPPS  rate  of  all 
students  taking  the  course.  The  smaller  the  FPPS  rate  at  the  PDR,  the  larger  the 
reciprocal  is,  and,  therefore,  the  greater  the  difficulty  of  satisfying  the  FPPS  criterion. 
Thus,  the  transition  from  Stage  II  to  Stage  III  will  produce  a  larger  upward  shift  for  a 
school  in  which  it  is  more  difficult  to  satisfy  the  FPPS  criterion.  The  standardization 
factor  is  the  ratio  of  the  hardness  index  to  the  FPPS  rate  at  the  PDR.  The  larger  the 
hardness  index  and  the  smaller  the  FPPS  rate  at  the  PDR,  the  larger  the  standardization 
factor.  Accordingly,  the  Stage  III  model  is 


$$
S^{III}_{ij} = \frac{H_j}{F_{PDR_j}}\, S^{II}_{ij} \qquad \text{(Stage III Equation)}
$$
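
The Stage III adjustment can be sketched as a single multiplication, taking the standardization factor to be the hardness index divided by the FPPS rate at the PDR, as the text states. The function name and the sample values are hypothetical.

```python
def spsu_stage3(stage2_value, hardness, f_pdr):
    """Standardized Stage III utility.

    stage2_value  Stage II utility for the applicant (0-100 scale)
    hardness      hardness index of the job option (0-1)
    f_pdr         FPPS rate at the PDR for the job option (a proportion)
    """
    if f_pdr <= 0:
        raise ValueError("FPPS rate at the PDR must be positive")
    return (hardness / f_pdr) * stage2_value


# Hypothetical applicant at or above the PDR of a job option with
# hardness 0.5 and FPPS rate 0.9 at the PDR (so Stage II = 100 * 0.9).
print(spsu_stage3(100 * 0.9, 0.5, 0.9))
```

Under this reading, an applicant at or above the PDR has a Stage II value of 100 times the PDR bin rate, so the Stage III value reduces to 100 times the hardness index and no longer depends on the FPPS rate.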


Observe from Figure 1 that the Stage II and Stage III models are discontinuous
between  cut  score  minus  one  and  the  cut  score,  unless  the  FPPS  rate  in  the  cut  score  bin 
is  zero.  Thus,  there  can  be  a  large  difference  between  the  SPSU  value  at  the  cut  score 
and  the  SPSU  value  (zero)  everywhere  below  the  cut  score. 
Criticisms of RIDE SPSU Model


Stage  I  Model,  Bin  Construction,  and  PDR  Determination 

Any procedure for grouping data in this manner is arbitrary. Application of different
grouping procedures, bin sizes, and bin range increments leads to different bin
memberships. Different bin memberships in turn produce different empirical
relationships  between  FPPS  and  composite  score  and,  consequently,  different  PDRs  and 
different  SPSU  models.  Furthermore,  there  is  no  a  priori  reason  to  believe  that  any  one 
combination  of  grouping  procedure,  bin  size,  and  bin  range  increment  is  superior  to  any 
other.  Increasing  Bin  Size  improves  the  accuracy  of  the  FPPS  rate  estimates  in  each  bin. 
However,  it  does  so  by  reducing  the  number  of  bins  and  increasing  the  length  of  the 
interval  between  bins.  As  a  result,  the  identification  of  each  bin  with  a  particular 
composite  score  becomes  more  arbitrary  and  diffuse.  In  addition,  although  there  is  no  a 
priori  reason  to  believe  that  the  PDR  necessarily  exists,  the  PDR  search  procedure  has 
been  defined  in  such  a  manner  that  it  will  always  find  one. 

FPPS  School  Performance  Criterion 

Several  potential  problems  may  arise  as  a  consequence  of  using  FPPS.  To  the 
author’s  knowledge,  FPPS  has  never  been  studied  or  utilized  in  previous 
NPRDC/NPRST  selection  and  classification  research.  It  is  not  possible  to  anticipate  how 
well  it  will  perform  in  comparison  to  school  performance  measures  used  in  ASVAB 
validation  studies,  such  as  final  school  grade  (FSG). 

A  tailor-made  school  performance  measure  is  usually  developed  during  the  course  of 
performing  an  ASVAB  validation  study.  Developing  such  a  measure  is  often  difficult  and 
time-consuming  because  a  detailed  understanding  of  the  course  and  student  evaluation 
process  is  required.  However,  from  the  author's  perspective,  the  effort  generally 
produces  a  criterion  that  does  well  in  differentiating  students  from  one  another.  The 
resulting  validity  coefficients  seem,  in  general,  to  be  larger  than  those  derived  from 
more  readily  available  performance  measures,  such  as  those  obtained  from  Navy 
Integrated  Training  Resources  and  Administration  System  (NITRAS).  A  corollary  to  this 
observation  is  that  differences  between  FPPS  and  a  tailor-made  performance  criterion 
may  be  substantial  enough  to  produce  different  validation  study  outcomes.  For  example, 
the  ASVAB  composite  that  correlates  the  highest  with  FPPS  in  a  particular  “A”  School 
pipeline  may  not  be  the  same  as  the  composite  that  correlates  highest  with  a  criterion 
that  is  tailor-made  for  the  “A”  School  in  that  pipeline. 

This  has  important  implications  for  RIDE.  The  SPSU  model  and  parameters  were 
developed  using  FPPS  as  the  school  performance  measure  and  the  current  ASVAB 
selector  composite  as  the  student  aptitude  measure.  However,  no  research  has  verified 
that  the  current  ASVAB  composites  are  still  optimal  in  terms  of  their  ability  to  predict 
FPPS  in  each  rating.  It  is  possible  that  some  composite  other  than  the  current  one  better 
predicts  FPPS.  The  definition  of  FPPS  is  broad  enough  to  include  any  number  of  school 
pipeline  segments,  in  addition  to  the  “A”  School.  No  research  has  explored  the  number 
of  schools  in  the  various  pipelines,  the  nature  of  the  courses  and  curricula  associated 
with the segments, or whether ASVAB aptitude measures are even relevant in terms of
their  ability  to  predict  success  in  segments  that  have  not  been  included  in  previous 
ASVAB  validation  studies. 

Another  potential  problem  is  that  many  pipelines  demonstrate  extreme  differences 
between  the  proportions  of  successes  and  failures  in  the  sample  (e.g.,  99%  FPP  success 
and  1%  FPP  failures).  Such  an  extreme  split  may  adversely  affect  the  estimation  of  the 
conditional probability of FPPS, particularly in the presence of outliers in the failure
subsample.

A  full  explanation  of  why  an  extreme  split  may  cause  problems  is  beyond  the  scope  of 
this  paper.  However,  a  very  brief  explanation  is  as  follows:  FPPS,  when  compared  to 
performance  measures  like  FSG  that  have  a  continuous,  bell-shaped  distribution,  has  a 
shortcoming  when  examined  from  a  mathematical  and  statistical  standpoint.  The 
dichotomization  of  a  continuous  performance  measure  necessitates  the  introduction  of 
an  additional  nuisance  parameter  into  the  analysis,  namely  the  location  of  the  point  on 
the  distribution  designating  the  boundary  between  the  successes  and  failures.  When  this 
point  is  near  either  extreme  of  the  distribution,  then  the  variance  of  its  estimate  is 
increased,  which  in  turn  adversely  affects  the  variances  of  the  slope  and  intercept 
parameters  in  the  conditional  probability  of  success  estimator  (Hannan  &  Tate,  1965; 
Prince & Tate, 1966).

Stage II Model

Although  the  Stage  II  model  is  considerably  simpler  than  the  Stage  I  model,  it 
wastefully  discards  all  data  except  the  cut  score  and  PDR  bins.  In  addition,  the 
imposition  of  linear  relationships  may  introduce  bias  to  the  estimation  of  the 
conditional  probability  of  FPPS  at  all  points  of  the  distribution,  except  at  the  Cut  Score 
and  the  PDR.  The  Stage  I  model,  like  any  estimator,  contains  estimation  error.  However, 
each  FPPS  rate  estimate  used  to  build  the  Stage  I  model  is  unbiased  because  the 
properties  of  the  binomial  distribution  ensure  it.  The  greater  the  degree  of  non-linearity 
demonstrated  by  the  Stage  I  model,  the  greater  the  bias  introduced  as  a  result  of 
imposing  the  Stage  II  model  on  top  of  it.  Consequently,  the  Stage  II  model  is 
contaminated  by  both  estimation  error  (inherited  from  the  Stage  I  model)  and  bias 
(from  imposing  linear  relationships  that  may  not  have  existed  in  Stage  I). 

Stage III Model

Adjusting  the  Stage  II  model  for  difficulty  in  satisfying  the  FPPS  criterion  is  a 
reasonable  standardization  technique.  However,  a  broader,  more  stable  school  difficulty 
measure  than  FPPS  rate  in  the  PDR  bin  should  be  used.  The  overall  FPPS  rate  in  the 
school  sample  seems  more  reasonable. 

Bin  Model  Evaluation 

Both  subjective  and  objective  evaluations  of  the  Bin  models  were  performed. 
Subjective  evaluations  were  performed  by  a  committee  consisting  of  Janet  Held  and 
Geoff  Fedak  of  Navy  Personnel  Research,  Studies,  and  Technology  (NPRST),  and  the 
author. Each committee member studied graphical displays of the 70 bin models and
judged whether each display indicated the presence or absence of a PDR. A majority
vote on each display indicated that a PDR was present in 25 out of the 70 models
(35.7%).

The  objective  evaluation  consisted  of  a  statistical  comparison  of  each  bin  model  with 
a  model  developed  using  a  baseline  methodology.  In  the  author's  opinion,  an  objective 
evaluation  of  the  bin  construction  process  required  a  baseline  methodology  for 
estimating  the  conditional  probability  of  FPPS  at  a  given  composite  score.  The  bin  and 
the  baseline  methodologies  were  compared  on  the  basis  of  the  accuracy  of  their 
respective  predictions  of  conditional  probability  of  FPPS.  Two  logistic  regression  model 
prototypes,  the  quadratic  logistic  regression  model  (QLRM)  and  the  linear  logistic 
regression  model  (LLRM),  were  selected  for  the  baseline  role.  Logistic  regression  is  a 
standard  methodology  for  estimating  the  conditional  mean  of  a  dichotomous  criterion 
variable  such  as  FPPS  (Hosmer  &  Lemeshow,  1989). 

The  testing  procedure  described  in  this  section  was  used  to  (a)  compare  the  LLRM 
and  QLRM  and  select  which  model  best  describes  the  relationship  between  composite 
score  and  FPPS  in  each  “A”  School  sample,  (b)  determine  whether  a  PDR  exists  in  each 
sample,  and  (c)  compare  the  selected  LRM  (either  QLRM  or  LLRM)  with  the  Bin  model 
and  determine  whether  the  Logistic  Regression  Model  (LRM)  or  Bin  model  best  fits  the 
data. 

Inclusion  of  the  QLRM  in  this  study  stems  from  the  central  role  of  the  PDR  concept 
in  RIDE  and  the  need  to  objectively  test  for  the  presence  of  a  PDR.  Thus,  the  choice 
between  LLRM  and  QLRM  provides  an  objective  test  for  determining  whether  a  PDR 
exists.  If  the  test  indicates  that  a  QLRM  (that  also  has  certain  characteristics  described 
below)  best  models  the  relationship  between  composite  score  and  FPPS,  then  there  is 
statistical  evidence  that  a  PDR  exists.  On  the  other  hand,  if  the  test  indicates  that  the 
LLRM  best  models  the  relationship  between  composite  score  and  FPPS,  then  there  is 
statistical  evidence  that  a  PDR  does  not  exist. 

$$
S^{Q}_{ij} = \left\{ 1 + \exp\left[ -\left( \alpha_{2j} X_{ij}^{2} + \alpha_{1j} X_{ij} + \alpha_{0j} \right) \right] \right\}^{-1}
$$

is the QLRM for the conditional probability of FPPS with respect to individual i in job option j.
$\alpha_{kj}$ is the coefficient of $X_{ij}^{k}$ (k = 0, 1, 2), and $X_{ij}$ is the score of individual i on the ASVAB
composite for job option j.
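
As an illustration, the QLRM can be evaluated directly once its three coefficients are known. The sketch below is not fitted to any data: the coefficients are hypothetical and were chosen so that the curve peaks at a composite score of 140, and the function name is likewise an assumption.

```python
import math


def qlrm_probability(x, a0, a1, a2):
    """Conditional probability of FPPS under a quadratic logistic model."""
    z = a2 * x * x + a1 * x + a0
    # Evaluate the logistic function in a numerically stable way.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients: intercept -37, linear 0.56, quadratic -0.002.
for score in (120, 130, 140, 150, 160):
    print(score, round(qlrm_probability(score, -37.0, 0.56, -0.002), 4))
```

With a negative quadratic coefficient the predicted probability rises up to the peak and falls symmetrically beyond it, which is the shape the PDR concept hypothesizes.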

The QLRM has exactly one extreme value point, which may be either a maximum or
a minimum. As demonstrated in Appendix B, the extreme value point is a minimum if
$\alpha_{2j} > 0$ and is a maximum if $\alpha_{2j} < 0$. Two distinct QLRM submodels resulted from
fitting the generic QLRM to the 70 “A” School samples. One (QLRM #1) is consistent
with the assumption of a monotonic increasing relationship between composite score
and FPPS over the interval between the cut score (CS) and the maximum observed
composite score in the sample ($C_{Max}$). Hereafter, denote this interval as [CS, $C_{Max}$]. The
second (QLRM #2) is an acceptable QLRM because it is consistent with the PDR
concept. Hypothetical curves for QLRM #1 and QLRM #2 are illustrated in Figure 2. It
is assumed that CS = 120 and $C_{Max}$ = 160 for all curves in Figure 2.
Figure 2. Logistic regression models.

QLRM #1 characteristics: $\alpha_{2j} > 0$ and so the extreme value point $X_{Min}$ is a minimum.
Also, $X_{Min} < CS < C_{Max}$. This is shown in Figure 2, where $X_{Min} = 110$, and so the model is
monotonic increasing on [CS, $C_{Max}$].

QLRM #2 characteristics: $\alpha_{2j} < 0$ and so the extreme value point $X_{Max}$ is a maximum.
In addition, $CS < X_{Max} < C_{Max}$. This is illustrated in Figure 2, where $X_{Max} = 140$ is a PDR,
since the relationship between X and FPPS is monotonic increasing on [CS, $X_{Max}$] and
monotonic decreasing (MD) on [$X_{Max}$, $C_{Max}$].
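
The two submodels can be told apart from the fitted coefficients alone. Because the logistic function is monotonic in its argument, the extreme value point of the QLRM lies at the vertex of the quadratic, that is, at minus the linear coefficient divided by twice the quadratic coefficient. The sketch below applies the two sets of characteristics above; the function name, the labels, and the sample coefficients are hypothetical, and the cut score and maximum score are the Figure 2 values of 120 and 160.

```python
def classify_qlrm(a1, a2, cs, c_max):
    """Classify a fitted QLRM by the sign of a2 and the location of its vertex.

    Returns (label, vertex), where vertex is the extreme value point.
    """
    if a2 == 0:
        return ("no quadratic term", None)
    vertex = -a1 / (2.0 * a2)
    if a2 > 0 and vertex < cs:
        # Minimum lies below the cut score: increasing on [cs, c_max].
        return ("QLRM #1", vertex)
    if a2 < 0 and cs < vertex < c_max:
        # Maximum lies inside (cs, c_max): the vertex acts as a PDR.
        return ("QLRM #2", vertex)
    return ("neither submodel", vertex)


print(classify_qlrm(a1=-0.44, a2=0.002, cs=120, c_max=160))
print(classify_qlrm(a1=0.56, a2=-0.002, cs=120, c_max=160))
```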

The  monotonic  character  of  the  LLRM  over  the  entire  composite  score  range  makes 
it  appropriate  in  the  context  of  using  aptitude  test  scores  to  predict  a  dichotomous 
training  school  success  measure  such  as  FPPS. 

$S_{ij} = \{1 + \exp[-(\beta_{1j} X_{ij} + \beta_{0j})]\}^{-1}$ is the LLRM for the conditional probability of FPPS
with respect to individual i in job option j. $\beta_{kj}$ is the coefficient of $X_{ij}^{k}$ (k = 0, 1), and $X_{ij}$
is the score of individual i on the ASVAB composite for job option j. As shown in Figure
2, there is no extreme value point associated with the LLRM, and hence it is monotonic
over the entire composite score range. The LLRM is monotonic increasing (monotonic
decreasing) if $\beta_{1j}$ is positive (negative).
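
As a minimal sketch of the LLRM just described, the conditional probability can be evaluated directly from the two coefficients. The function name and the coefficient values below are illustrative assumptions, not estimates from this report; the score range mirrors the CS = 120 and CMax = 160 used for Figure 2.

```python
import math


def llrm_probability(score, beta0, beta1):
    """Conditional probability of FPPS under the LLRM:
    1 / (1 + exp(-(beta1 * score + beta0)))."""
    return 1.0 / (1.0 + math.exp(-(beta1 * score + beta0)))


# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -4.0, 0.04

# Evaluate at composite scores from CS = 120 to CMax = 160.
scores = list(range(120, 161, 10))
probs = [llrm_probability(x, beta0, beta1) for x in scores]

for x, p in zip(scores, probs):
    print(x, round(p, 3))
```

Because the logit is linear in the score, a positive slope coefficient gives probabilities that rise over the whole range, with no interior extreme value point.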

The  following  criteria  were  used  to  choose  between  LLRM  and  QLRM: 

• With the exception of QLRM #2, the model should be monotonic increasing on
[CS, $C_{Max}$]. This consideration is based upon the assumption that the relationship
between composite score and FPPS should, in general, be monotonic increasing.
• P-value test: We choose between LLRM and QLRM based primarily on the p-values
associated with the highest degree parameter in the respective models. The
highest degree parameter of the LLRM is $\beta_{1j}$, whereas the highest degree
parameter of the QLRM is $\alpha_{2j}$. Comparison of their p-values indicates which
parameter we may conclude, with the greatest degree of confidence, is unequal to
zero, and, consequently, whether the QLRM or LLRM best fits the data. See
Appendix A for discussion on the interpretation of p-values.

Define $\hat{\beta}_{1j}$ as the LLRM estimate of $\beta_{1j}$ and $\hat{\alpha}_{2j}$ as the QLRM estimate of $\alpha_{2j}$. For the
“A” School sample associated with job option j, we use the p-value to make a preliminary
choice between the QLRM and LLRM by applying the following decision rules:

If p-value($\hat{\alpha}_{2j}$) < p-value($\hat{\beta}_{1j}$), we may conclude with greater confidence that $\alpha_{2j}$ is
unequal to zero than we could that $\beta_{1j}$ is unequal to zero. Thus, our preliminary choice is
QLRM, which we finalize by performing steps 1 through 3:

1.  If  the  QLRM  satisfies  the  characteristics  of  QLRM  categories  #1  or  #2  and  if  no 
errors  were  detected  during  model  fit,  the  model  is  declared  as  QLRM.  If,  in 
addition,  the  QLRM  satisfies  the  characteristics  of  QLRM  #2,  then  a  PDR  is 
declared  to  exist. 

2. If (1) is not satisfied, the final model choice is LLRM, provided that $\hat{\beta}_{1j} > 0$ and
no error conditions were detected during parameter estimation.

3.  If  (2)  is  not  satisfied,  then  the  model  is  declared  “No  Decision,”  indicating  that 
neither  QLRM  nor  LLRM  provides  a  satisfactory  fit. 

If p-value($\hat{\alpha}_{2j}$) > p-value($\hat{\beta}_{1j}$), then our preliminary model choice is LLRM. That
decision becomes final if $\hat{\beta}_{1j} > 0$ and no error conditions were detected during
parameter estimation. However, if $\hat{\beta}_{1j} < 0$ or at least one error is detected, the model is
declared “No Decision.”
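
The decision rules above can be sketched as a single function. This is an illustration under stated assumptions rather than the procedure actually coded for the study: the function and argument names are hypothetical, ties in the p-values are sent to the LLRM branch, and a slope estimate of exactly zero is treated as "No Decision", since the text does not address either case.

```python
def choose_model(p_alpha2, p_beta1, beta1_hat,
                 qlrm_category, qlrm_error, llrm_error):
    """Return (model, pdr_exists) for one "A" School sample.

    p_alpha2, p_beta1 : p-values of the highest-degree QLRM and LLRM terms
    beta1_hat         : LLRM slope estimate
    qlrm_category     : 1 or 2 if the fitted QLRM matches QLRM #1 or #2, else None
    qlrm_error, llrm_error : True if an error was detected during the fit
    """
    llrm_ok = beta1_hat > 0 and not llrm_error

    if p_alpha2 < p_beta1:
        # Preliminary choice is QLRM (steps 1 through 3).
        if qlrm_category in (1, 2) and not qlrm_error:
            return "QLRM", qlrm_category == 2   # a PDR exists only for QLRM #2
        if llrm_ok:
            return "LLRM", False
        return "No Decision", False

    # Preliminary choice is LLRM (ties are sent here by assumption).
    if llrm_ok:
        return "LLRM", False
    return "No Decision", False


print(choose_model(0.01, 0.20, 0.05, 2, False, False))
print(choose_model(0.30, 0.01, -0.02, None, False, False))
```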

Once  the  QLRM  vs.  LLRM  winner  is  selected,  it  is  compared  with  the  Stage  II  bin 
model.  The  (non-standardized)  Stage  II  model,  rather  than  the  (standardized)  Stage  III 
model,  is  compared  with  the  QLRM-LLRM  winner  because  the  basis  for  comparison  is 
accuracy  of  conditional  probability  of  FPPS  prediction. 

The “expected absolute total error” (EATE) criterion was used to compare the Bin
and LRM models. As described under Criticisms of RIDE SPSU Model, the Bin model
construction process introduces both bias and estimation error into its estimate of
conditional probability of FPPS. In contrast, the asymptotic unbiased property of
maximum likelihood estimators (MLE) means that the logistic regression parameter
estimates are unbiased in the limit as the sample size becomes large (Stuart & Ord,
1991). (The author is not aware of any studies indicating whether, for a fixed sample
size, the LRM parameter MLEs are still unbiased and, if not, the degree of bias present.)

Preliminary Bin vs. LRM comparisons were performed using 95 percent confidence
intervals. Overall, these results indicated that the LRM had slightly narrower confidence
interval widths than the Bin model. However, since estimator bias is not considered in
the confidence interval calculation, a criterion was sought that would incorporate both
bias and estimation error variance into the comparison. The EATE criterion was
developed by assuming that $\varepsilon$ (epsilon, the total FPPS rate estimation error due to the
presence of both bias and FPPS rate estimation error variance) is normally distributed
with mean equal to the bias and variance equal to the estimation error variance.
Mathematically, EATE is the expected value of the absolute value of $\varepsilon$, $E|\varepsilon|$, and is
given by


$$\mathrm{EATE} = E|\varepsilon| = 2\sigma\,\varphi\!\left(\frac{\mu}{\sigma}\right) - 2\mu\,\Phi\!\left(-\frac{\mu}{\sigma}\right) + \mu ,$$

where

$\sigma^2$ is estimation error variance,

$\mu$ is the bias of the estimator,

$\varphi(\cdot)$ is the standard normal probability density function, and

$\Phi(\cdot)$ is the standard normal cumulative distribution function.

This  formula  is  used  to  calculate  EATE  of  the  logit  in  the  LRM  model,  and  the  EATE 
of  the  Bin  model.  The  derivation  of  EATE  is  given  in  Appendix  A,  as  are  the  procedural 
details  of  the  Bin  vs.  LRM  comparison. 
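
As an illustration of the criterion, the expected absolute value of a normally distributed error with mean $\mu$ (the bias) and standard deviation $\sigma$ has the closed form $2\sigma\varphi(\mu/\sigma) - 2\mu\Phi(-\mu/\sigma) + \mu$. The sketch below evaluates that expression using only the standard library; the function names and the sample bias and standard deviation are assumptions made for the example and are not values from the study.

```python
import math


def std_normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def eate(bias, sigma):
    """E|eps| for eps ~ Normal(bias, sigma**2), with sigma > 0."""
    z = bias / sigma
    return 2.0 * sigma * std_normal_pdf(z) - 2.0 * bias * std_normal_cdf(-z) + bias


# Hypothetical inputs: an unbiased estimator versus a biased one
# with the same estimation error standard deviation.
print(eate(0.0, 0.03))
print(eate(0.04, 0.03))
```

With zero bias the expression reduces to $\sigma\sqrt{2/\pi}$, and when the bias is large relative to $\sigma$ it approaches the absolute bias, which is why the criterion penalizes both sources of error.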

Table  1  summarizes  the  results  of  the  analyses.  The  Bin  columns  indicate  the  mean  of 
the  Bin  model  EATEs  for  each  rating.  Each  mean  was  computed  by  averaging  the  EATEs 
over all integer composite scores between the cut score and the $C_{Max}$ in that rating. The
LRM columns indicate the mean of the LRM model EATEs for each rating, again
computed by averaging over all composite scores between the cut score and $C_{Max}$ for that
rating.  When  these  columns  were  averaged  over  all  ratings,  the  mean  Bin  EATE  was 
0.050  and  the  mean  LRM  EATE  was  0.030.  The  B/L  columns  indicate  whether  the  Bin 
model  or  LRM  model  produced  the  smaller  EATE.  In  this  comparison,  the  Bin  model 
produced  the  smaller  EATE  only  7  times,  while  the  LRM  produced  the  smaller  EATE  62 
times.  The  Model  column  indicates  which  LRM  (QLRM  or  LLRM)  was  the  superior 
LRM  for  that  rating  and  was  matched  against  the  Bin  model  in  the  EATE  comparison. 
The  appearance  of  (PDR)  in  that  column  indicates  that  the  chosen  QLRM  satisfied  the 
criteria  for  the  existence  of  a  PDR.1  Six  of  the  70  LRMs  were  QLRM,  55  of  them  were 
LLRM,  and  the  remaining  9  were  “No  Decision.”2  Five  of  the  six  ratings  that  satisfied 
the  QLRM  criteria  also  satisfied  the  conditions  for  the  existence  of  a  PDR.  These  ratings 
are  designated  by  an  asterisk  (*)  in  the  Model  columns. 


1  Note:  The  LLRM  was  matched  against  the  Bin  model  whenever  the  QLRM  vs.  LLRM  comparison 
resulted  in  a  “No  Decision”  outcome. 

2  No  Bin  vs.  LRM  comparison  was  performed  for  the  PH  5Y  rating  because  only  one  bin  resulted  when  the 
bin  construction  procedure  was  performed  on  that  sample.  At  least  2  bins  are  necessary  to  calculate  the 
Stage  I  estimate. 
Table 1
Bins vs. LRM

Rate     Bin   LRM   B/L  Model    Rate     Bin   LRM   B/L  Model
-------  ----  ----  ---  ------   -------  ----  ----  ---  ------
AB-GE    .065  .048  L    QLRM*    AC-5Y    .097  .037  L    LLRM
AD-SG    .012  .010  L    LLRM     AE-SG    .022  .013  L    LLRM
AECFAE   .048  .015  L    LLRM     AG-SG    .100  .032  L    LLRM
AK-SG    .027  .021  L    NoDe     AM-GE    .013  .008  L    LLRM
AO-SG    .061  .020  L    LLRM     AS-SG    .119  .056  L    NoDe
AT-GE    .042  .017  L    LLRM     AZ-SG    .020  .013  L    LLRM
BU-5Y    .085  .042  L    QLRM*    CE-5Y    .040  .020  L    LLRM
CM-5Y    .064  .025  L    LLRM     CTA-SG   .017  .017  L    LLRM
CTI-SG   .119  .031  L    LLRM     CTM-AE   .046  .024  L    LLRM
CTO-SG   .032  .033  B    QLRM*    CTR-SG   .053  .035  L    NoDe
CTT-SG   .015  .010  L    LLRM     DC-SG    .023  .013  L    LLRM
DK-SG    .060  .035  L    NoDe     DT-GE    .021  .010  L    LLRM
EA-5Y    .055  .039  L    LLRM     EM-SG    .045  .022  L    LLRM
EN-SG    .060  .023  L    LLRM     EN-AT    .108  .035  L    LLRM
EO-5Y    .015  .010  L    LLRM     ETS-GE   .051  .019  L    LLRM
EW-SG    .036  .027  L    LLRM     EW-AE    .138  .048  L    LLRM
FT-GE    .028  .024  L    LLRM     GM-SG    .053  .034  L    LLRM
GSE-GE   .083  .040  L    LLRM     GSM-GE   .075  .028  L    LLRM
HM-GE    .027  .008  L    LLRM     HT-GE    .028  .023  L    QLRM
IC-GE    .111  .032  L    LLRM     IS-SG    .043  .025  L    LLRM
JO-5Y    .058  .044  L    LLRM     LI-SG    .025  .050  B    LLRM
MM-SG    .029  .014  L    LLRM     MM-NF    .069  .085  B    LLRM
MMS-SG   .078  .020  L    LLRM     MN-SG    .038  .029  L    NoDe
MR-SG    .040  .036  L    LLRM     MS-SG    .072  .019  L    NoDe
MSS-SG   .070  .049  L    LLRM     MT-AE    .036  .027  L    LLRM
[?]-NF   .016  .022  B    LLRM     OS-SG    .008  .007  L    LLRM
PH-5Y    --    --    --   QLRM*    PN-SG    .027  .020  L    LLRM
PR-SG    .011  .015  B    LLRM     QM-SG    .044  .038  L    LLRM
RM-SG    .029  .012  L    LLRM     RP-SG    .048  .048  B    NoDe
SH-SG    .032  .021  L    LLRM     SK-SG    .028  .019  L    NoDe
SKS-SG   .101  .094  L    LLRM     SM-SG    .030  .024  L    LLRM
SS-SF    .040  .016  L    LLRM     STG-GE   .007  .007  L    LLRM
STS-GE   .068  .022  L    LLRM     SW-5Y    .074  .042  L    NoDe
TM-SG    .065  .041  L    LLRM     UT-5Y    .037  .038  B    QLRM*
YN-SG    .033  .013  L    LLRM     YNS-SG   .094  .071  L    LLRM

School Success Component (SSC) of CLASP


The  school  success  utility  component  predicts  “A”  School  success  as  a  function  of  the 
operational  ASVAB  selector  composite  for  each  rating.  Prior  to  CLASP,  classifiers  made 
“A”  School  assignments  based  on  the  cut  score  for  each  rating,  without  considering  the 
degree  to  which  the  applicant  may  exceed  that  score.  Based  upon  the  assumption  that  an 
applicant's  likelihood  of  success  increases  with  aptitude  test  score,  the  school  success 
component  was  designed  to  incorporate  information  about  the  complete  range  of  scores, 
instead  of  focusing  solely  on  whether  the  cut  score  was  satisfied.  For  the  original  CLASP 
implementation  in  the  early  1980s,  Navy  validation  samples  were  obtained  from  Paul 
Foley  of  Navy  Personnel  Research  and  Development  Center  (NPRDC).  Linear  regression 
analyses  were  performed  to  develop  unique  school  success  equations  for  ratings  in 
which  validation  data  was  available.  Thus,  in  the  original  CLASP  implementation, 
different  selector  composites  were  used  to  predict  school  success  for  different  ratings. 
The  original  equations  were  typically  characterized  by  non-integer  weights  and,  in  some 
instances,  negative  weights. 

In  1984,  a  new  policy  allowed  only  operational  ASVAB  selector  composites  to  be  used 
as  school  success  equations  in  CLASP.  Therefore,  the  current  school  success  equation 
for  each  job  option  is  identical  to  the  ASVAB  composite  currently  used  for  selection 
purposes.  Accordingly,  CLASP  school  success  criterion  measures  vary  from  job  option  to 
job  option.  For  a  given  job  option,  the  school  success  criterion  is  determined  by  the  “A” 
School  performance  measure  used  in  the  ASVAB  validation  study  that  recommended 
use  of  that  particular  composite.  Whenever  an  ASVAB  validation  study  recommends 
that  the  ASVAB  composite(s)  currently  used  for  selection  and/or  the  associated  cut 
score(s)  be  replaced,  NPRST  immediately  submits  for  operational  CLASP 
implementation  an  updated  school  success  mean  and  standard  deviation  for  each  job 
option  associated  with  the  rating.  When  an  ASVAB  selector  composite  change  is 
recommended  and  approved,  Commander,  Navy  Recruiting  Command  (CNRC)  then 
changes  the  corresponding  school  success  equation(s)  in  the  operational  CLASP 
implementation.  Several  “A”  Schools  select  students  using  multiple  composites  and  cut 
scores,  either  as  a  multiple  hurdle  or  as  an  "either/or"  criterion.  For  CLASP  job  options 
associated  with  these  ratings,  one  composite  is  designated  by  NPRST  as  the  CLASP 
school  success  equation. 

The standardized school success payoff for individual i in rating j is given by

$$S^{*}_{ij} = 50 + 10\left(\frac{S_{ij} - \mu_{SS,j}}{\sigma_{SS,j}}\right) \qquad \text{(Sch\_Suc)}$$

where

$S^{*}_{ij}$ is the standardized school success payoff associated with placing individual i in
rating j,

$S_{ij}$ is the ASVAB composite score for individual i in rating j,

$\mu_{SS,j}$ is the $S_{ij}$ reference population mean for rating j, and

$\sigma_{SS,j}$ is the $S_{ij}$ reference population standard deviation for rating j.
The reference population used to estimate the mean and standard deviation consists
only of recruits who satisfy the ASVAB selector criteria for that rating, not the entire
recruit population. Subtracting $\mu_{SS,j}$ from $S_{ij}$ and dividing that difference by $\sigma_{SS,j}$ in
equation Sch_Suc adjusts each $S_{ij}$ for differences across ratings in the average ability
level required to qualify for and successfully complete “A” School. As a result, equation
Sch_Suc transforms the $S_{ij}$ into a common metric for all ratings and facilitates
comparison across ratings for individual i. However, it is not known if conversion to this
common metric is sufficient to completely eliminate the tendency for the easier schools
to experience higher school success utility scores, on the average.
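
A minimal sketch of the Sch_Suc standardization follows. The reference means and standard deviations shown are hypothetical values chosen for illustration, not CLASP parameters, and the function name is an assumption.

```python
def standardized_payoff(composite_score, ref_mean, ref_sd):
    """Sch_Suc: rescale a composite score to mean 50 and standard deviation 10
    within the rating's reference population."""
    return 50.0 + 10.0 * (composite_score - ref_mean) / ref_sd


# Hypothetical reference parameters for two ratings that use the same composite.
# The same raw score yields different payoffs because the qualifying
# populations differ.
applicant_score = 118
print(standardized_payoff(applicant_score, ref_mean=110.0, ref_sd=8.0))
print(standardized_payoff(applicant_score, ref_mean=120.0, ref_sd=6.0))
```

The example shows why the transformation puts ratings on a common metric: a score that is above average among those who qualify for one rating can be below average among those who qualify for another.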

CLASP  Parameter  Update  Considerations 

The  reference  population  means  and  standard  deviations  for  each  job  category  are 
the  only  School  Success  component  parameters  subject  to  updating.  The  CLASP 
parameter  update  software  automatically  generates  an  update  for  these  parameters 
during  the  annual  CLASP  parameter  update.  In  addition,  NPRST  possesses  a  software 
package  to  update  any  specified  subset  of  the  school  success  mean  and  standard 
deviation  parameters  when  ASVAB  selector  composite  and/or  cut  score  changes  have 
been  recommended  and  approved. 

RIDE SPSU and CLASP SSC Summary

The section closes with a discussion of several important considerations in building
and maintaining the SPSU component of RIDE. Also included is a description of
strengths and weaknesses of SSC and SPSU.

CLASP  SSC  Weaknesses 

School success equations in the current CLASP implementation are chosen from a
short list of approximately 12 unique ASVAB selector composites. This small number
of unique composites, relative to the approximately 120-130 job options currently sold
in CLASP, means that the same composite is used for several job options. For example,
as of March 2003, CLASP used Verbal and Arithmetic Reasoning (VE+AR) to predict
school success in 17 job options and Arithmetic Reasoning, Math Knowledge,
Electronics Information, and General Science (AR+MK+EI+GS) in 30 job options.
Hence,  the  SSC  has  a  limited  differential  prediction  capability,  meaning  that  it  cannot 
distinguish  differences  in  school  success  utility  between  pairs  of  job  options  using  the 
same  equation.  A  partial  solution  may  be  achieved  in  job  options  that  use  multiple 
composites  for  selection,  either  as  a  multiple  hurdle  or  as  an  "either/or"  criterion.  If 
appropriate  weights  could  be  found,  additional  school  success  equations  could  be 
created  by  taking  a  weighted  sum  of  all  composites  appearing  in  the  “A”  School  selection 
standards  for  these  job  options.  The  number  of  CLASP  job  options  sharing  the  same 
composite  could  be  reduced  substantially.  In  addition  to  concerns  regarding  the  quality 
of  differential  prediction,  the  SSC  lacks  the  flexibility  to  implement  anything  other  than 
a  linear  relationship  between  composite  score  and  utility. 
CLASP SSC Strengths


The  advantage  of  the  SSC  is  that  only  Navy  applicant  data  from  PRIDE  is  required  to 
update  the  CLASP  parameters,  including  SSC  mean  and  standard  deviation  parameters. 
When  an  ASVAB  validation  study  recommends  a  selector  composite  and/or  cut  score 
change  for  a  given  rating,  CLASP  does  not  require  a  new  prediction  model.  CLASP 
requires  only  that  mean  and  standard  deviation  parameters  for  that  rating  be  updated 
based  upon  the  new  composite  and/or  cut  score.  NPRST  uses  a  simple  procedure  to 
estimate  the  new  parameters  and  forward  them  for  implementation.  “A”  School 
validation  samples  are  not  required  for  this  purpose. 

RIDE SPSU Weaknesses

Development  and  maintenance  of  bin  and/or  logistic  regression  models  for 
predicting  FPPS  requires  an  “A”  School  validation  sample  for  each  rating.  As  is  currently 
the  case  with  CLASP,  when  an  ASVAB  validation  study  recommends  a  change  to  the 
operational  selector  composite  in  a  given  rating,  a  corresponding  change  to  the  SPSU 
component  of  RIDE  will  be  required.  However,  unlike  CLASP,  the  RIDE  parameter 
update requires estimation of both a new PDR and the FPPS rate at the new PDR.
School performance data would be required to accomplish this. In addition, it is
anticipated that in some situations, such changes may be more difficult and
time-consuming than is currently the case with CLASP. When a selector composite change is
recommended  and  approved  for  a  given  rating,  it  may  not  be  advisable  to  immediately 
develop  and  implement  a  new  FPPS  prediction  model  for  that  rating  using  the  currently 
available  validation  sample  and  the  replacement  (i.e.,  new)  selector  composite.  This  will 
be  especially  true  if  the  incumbent  and  replacement  composites  will  select  student 
populations  that  are  significantly  different  from  one  another.  Accordingly,  it  may  not  be 
feasible  to  implement  the  new  selector  composite  in  RIDE  until  after  sufficient  students 
have  been  selected  with  the  replacement  composite  to  develop  and  implement  a  new 
prediction  model. 

RIDE SPSU Strengths

Availability  of  “A”  School  performance  data  will  facilitate  development  of  non-linear 
models  of  the  relationship  between  composite  score  and  school  performance.  It  will  also 
facilitate  development  of  unique  SPSU  equations  for  more  job  options  than  is  currently 
feasible  in  CLASP.  Although  previous  sections  raised  several  questions  concerning  the 
quality  of  the  FPPS  criterion  and  the  quality  of  the  Bin  and  LRM  estimators  of  the 
conditional  probability  of  FPPS,  the  availability  of  school  performance  data  would 
facilitate  further  research  on  criterion  measure  alternatives  to  FPPS. 
CLASP Aptitude/Difficulty and RIDE AFQT Component Comparison


In  ascertaining  whether  an  applicant  is  suited  to  a  particular  job,  the  employer  must 
assess  the  job's  requirements  and  the  applicant's  abilities.  The  employer  must  decide 
whether  the  prospective  employee  has  the  abilities  required  to  succeed  in  the  job. 

During  a  typical  employment  interview,  the  employer  judges  the  applicant's  abilities 
using  some  internal  scale.  The  internal  scale  may  not  be  well  defined,  but  allows  the 
employer  to  evaluate  and  rank-order  prospective  employees.  The  employer  can  be  more 
certain  about  the  characteristics  of  the  job  and  the  type  of  person  most  likely  to  fill  the 
job  successfully.  The  employer's  experience  enables  him  to  rank-order  jobs  based  on  the 
technical  ability  they  require.  This  continuum  forms  a  second  scale.  For  example,  an 
employer  may  judge  that  a  particular  applicant  belongs  to  the  upper  25  percent  of 
applicants,  as  assessed  on  the  internal  aptitude  scale.  A  particular  job  may  be  rated  by 
the  employer  as  belonging  to  the  upper  25  percent  of  jobs  on  the  scale  of  technical 
aptitude  required  to  succeed.  Having  established  the  relative  positions  of  both  the  job 
and  the  applicant  on  their  respective  scales,  the  employer  may  judge  their 
correspondence  to  each  other.  In  this  case,  there  appears  to  be  a  match  and  the 
applicant  will  likely  be  offered  the  job. 

The  Aptitude/Difficulty  component  of  CLASP  works  similarly  to  the  employer's 
evaluative  techniques.  This  utility  function  involves  two  scales:  (1)  a  measure  of  an 
applicant's  overall  technical  aptitude,  and  (2)  a  measure  of  the  rating's  technical 
difficulty  or  complexity.  Thus,  given  an  applicant's  technical  aptitude  and  a  rating's 
technical  difficulty,  the  utility  of  that  person-job  match  may  be  evaluated  and  compared 
with  other  possible  person-job  matchups. 

Kroeker  and  Rafacz  (1983)  provide  details  concerning  the  technical  aptitude  and  job 
difficulty  scales.  The  technical  aptitude  composite  (TAC),  computed  as  MC+AS+EI+GS, 
measures  the  applicant's  technical  aptitude  for  purposes  of  the  Aptitude/Difficulty 
component. The following equation transforms the TAC so the resulting transformed
aptitude score (TAS) is between 40 and 100, inclusive.


$$A_i = 40 + 60\left(\frac{C_i - 180}{280 - 180}\right) \qquad \text{(Apt\_Dif\_TAS)}$$


Truncate to $A_i = 100$ if $C_i > 280$ and truncate to $A_i = 40$ if $C_i < 180$, where $A_i$ and $C_i$
are the TAS and TAC scores for individual i, respectively. The TAS distribution must fall
in this range because the Aptitude/Difficulty utility function described below is
constructed such that its aptitude argument must satisfy this property.

As indicated by equation Apt_Dif_TAS, the transformation truncates TAC scores
that are either less than 180 or greater than 280 so that they fall at the extremes of the
TAS distribution. Since the minimum standard score of each subtest is 20 and the
maximum standard score is 80, the minimum TAC score is 4 x 20 = 80 and the
maximum is 4 x 80 = 320. Consequently, TAC scores between 80 and 180 correspond to
a score of 40 on the TAS, while TAC scores between 280 and 320 correspond to a TAS
score of 100. The original rationale for the truncation in equation Apt_Dif_TAS is
unknown, but its apparent effect is to prevent the TAS distribution from being tightly
concentrated around its mean and to spread it more uniformly over the range between
40 and 100.
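
The transformation and its truncation can be sketched in a few lines. The function name is an assumption; the constants 180, 280, 40, and 60 are those of equation Apt_Dif_TAS.

```python
def transformed_aptitude_score(tac):
    """Apt_Dif_TAS: map a technical aptitude composite (TAC) onto the
    40-to-100 TAS scale, truncating TAC scores outside [180, 280]."""
    clipped = min(max(tac, 180), 280)
    return 40.0 + 60.0 * (clipped - 180) / (280 - 180)


# TAC scores can range from 4 x 20 = 80 to 4 x 80 = 320.
for tac in (80, 180, 230, 280, 320):
    print(tac, transformed_aptitude_score(tac))
```

Every TAC score at or below 180 lands on a TAS of 40, and every score at or above 280 lands on 100, so only the middle 100 points of the 240-point TAC range are spread over the TAS scale.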

The job difficulty scale was established using paired comparison methodology
(Kroeker, personal communication, 1998). Initial scale values were produced for the
complete  job  set  by  applying  the  paired  comparison  procedure  to  two  data  sets:  (1) 
experimenter  judgments  about  the  cognitive  skills  required  by  each  job,  and  (2) 
experimenter  estimates  of  the  visual  perceptual  attributes  required.  Data  were  then 
collected  from  subject  matter  experts  (SMEs)  who  were  asked  to  compare  the  job 
difficulty  of  small  groups  of  ratings.  The  SMEs  ranked  the  difficulty  of  8  to  10  jobs  in 
pairs,  thus  contributing  to  a  matrix  from  which  new  scale  values  could  be  derived  for  the 
entire  job  set.  The  scale  was  then  modified  by  using  an  iterative  procedure  to  revise 
psychological  values  (Kroeker,  1982). 

The unstandardized technical aptitude/job difficulty utility associated with assigning
person i to job j is given by:

Equation Apt_Dif_Util:

$$U_{A/D}(A_i, D_j) = B_{0,0} + B_{2,0}(A_i - 100)^2 + B_{0,1}(D_j - 35) + B_{2,2}(A_i - 100)^2 (D_j - 35)^2 + B_{2,1}(A_i - 100)^2 (D_j - 35) + B_{0,2}(D_j - 35)^2$$

where $B_{0,0} = 30.0$, $B_{2,0} = -0.0005$, $B_{0,1} = 1.867$, $B_{2,2} = -0.00001696$, $B_{2,1} = -0.0001867$,
and $B_{0,2} = -0.01244$. $U_{A/D}(A_i, D_j)$ is the raw Aptitude/Difficulty utility of assigning
person i to job j, $A_i$ is the TAS score of person i, and $D_j$ is the job difficulty of rating j.
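
As a sketch, the raw utility can be evaluated from the six listed coefficients. The dictionary layout and function name are assumptions; the coefficient values are those given for equation Apt_Dif_Util, with the first index taken as the power of (A - 100) and the second as the power of (D - 35), and unlisted terms taken to be zero.

```python
# Coefficients B[(m, n)]: m is the power of (A - 100), n the power of (D - 35).
B = {
    (0, 0): 30.0,
    (2, 0): -0.0005,
    (0, 1): 1.867,
    (2, 2): -0.00001696,
    (2, 1): -0.0001867,
    (0, 2): -0.01244,
}


def raw_ad_utility(tas, difficulty):
    """Raw Aptitude/Difficulty utility for a TAS score and a job difficulty."""
    x = tas - 100.0
    y = difficulty - 35.0
    return sum(b * x**m * y**n for (m, n), b in B.items())


for a, d in [(40, 40), (40, 100), (100, 40), (100, 100), (70, 40), (70, 100)]:
    print(a, d, round(raw_ad_utility(a, d), 2))
```

Evaluated at these six points, the polynomial agrees to two decimal places with the utility values quoted among the initial conditions in the development that follows, which is a useful check on the coefficient list.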

The following briefly explains the development of equation Apt_Dif_Util. Ward
(1977) is an excellent source reference for this topic. The classification policymaker
assumed that $U_{A/D}(A_i, D_j)$ is a polynomial in two variables: applicant aptitude $A_i$ and job
difficulty $D_j$. The maximum degrees of $A_i$ and $D_j$ of the (bivariate) polynomial are
determined by the number of initial conditions, as described below. Hence, it was
originally specified as

$$U_{A/D}(A_i, D_j) = \sum_{m=0}^{2} \sum_{n=0}^{2} B_{m,n} (A_i - 100)^m (D_j - 35)^n \qquad \text{(Apt\_Dif\_Poly)}$$

where m and n index the powers of the aptitude and difficulty terms, respectively.

There are (2+1) x (2+1) = 9 unknown coefficients ($B_{0,0}$, $B_{0,1}$, $B_{0,2}$, $B_{1,0}$, $B_{1,1}$, $B_{1,2}$, $B_{2,0}$,
$B_{2,1}$, and $B_{2,2}$) to be determined. Step 2 specifies a set of initial conditions (either on the
utility function itself or on its partial derivatives) at critical values of A and D. For
example, equation (Apt_Dif_Poly) was developed using a set of initial conditions similar
to the following:
(1) $U_{A/D}(40, 40) = 32.34$

(2) $\partial U_{A/D} / \partial D = 0$ when evaluated at A = 40, D = 43.1

(3) $U_{A/D}(40, 100) = -204.65$

(4) $U_{A/D}(100, 100) = 98.8$

(5) $U_{A/D}(100, 40) = 39.02$

(6) $\partial U_{A/D} / \partial D = 0$ when evaluated at A = D = 100

(7) $U_{A/D}(70, 40) = 37.35$

(8) $U_{A/D}(70, 100) = 22.93$

(9) $\partial U_{A/D} / \partial D = 0$ when evaluated at A = 70, D = 65.7


Note  the  number  of  initial  conditions  equals  the  number  of  unknown  coefficients. 
When  the  initial  conditions  are  substituted  into  Apt_Dif_Poly,  the  result  is  a  system  of  9 
linear  equations  in  the  9  unknown  coefficients.  The  coefficients  may  be  determined  by 
solving  the  linear  system. 
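
The mechanics of this step can be sketched as follows: each condition contributes one row of a 9 x 9 linear system in the coefficients, which is then solved by Gaussian elimination. The helper names are assumptions, and the conditions are the nine listed above. Because the text describes those conditions only as similar to the ones actually used, the solved coefficients should not be expected to reproduce the published Apt_Dif_Util values; the sketch only shows that nine such conditions determine the nine coefficients.

```python
def basis_row(a, d, deriv_d=False):
    """Row of multipliers for B[m][n], m, n = 0..2, at aptitude a and difficulty d.
    With deriv_d=True the row represents the partial derivative with respect to D."""
    x, y = a - 100.0, d - 35.0
    row = []
    for m in range(3):
        for n in range(3):
            if deriv_d:
                row.append(0.0 if n == 0 else n * x**m * y**(n - 1))
            else:
                row.append(x**m * y**n)
    return row


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - tail) / aug[r][r]
    return x


# (A, D, is_derivative_condition, required value)
CONDITIONS = [
    (40, 40, False, 32.34), (40, 43.1, True, 0.0), (40, 100, False, -204.65),
    (100, 100, False, 98.8), (100, 40, False, 39.02), (100, 100, True, 0.0),
    (70, 40, False, 37.35), (70, 100, False, 22.93), (70, 65.7, True, 0.0),
]

matrix = [basis_row(a, d, der) for a, d, der, _ in CONDITIONS]
rhs = [value for _, _, _, value in CONDITIONS]
coeffs = solve(matrix, rhs)   # ordered B00, B01, B02, B10, B11, B12, B20, B21, B22


def utility(a, d):
    """Evaluate the fitted polynomial at aptitude a and difficulty d."""
    return sum(c * t for c, t in zip(coeffs, basis_row(a, d)))


print([round(c, 6) for c in coeffs])
```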

As described by Ward (1977), the initial conditions are based upon policymaker
requirements regarding the desired behavior of the function at pre-specified values of A
and D. Judicious choices in the initial condition specification will give $U_{A/D}(A, D)$ its
desired appearance over the entire range of allowable values of A and D.

As  far  as  the  author  can  determine,  the  Aptitude/Difficulty,  Priority/Preference,  and 
Attrition  Component  Utility  functions  were  all  developed  as  mathematical 
representations  of  personnel  classification  policy.  For  example,  the  Aptitude/Difficulty 
utility  function  is  based  upon  policymaker  judgments  concerning  the  value  to  the  Navy 
of  assigning  an  individual  with  a  given  technical  aptitude  level  to  a  job  with  a  given  level 
of  technical  difficulty.  The  A/D,  P/P,  and  Attrition  utility  functions  do  not  appear  to 
either  represent  the  outcome  or  results  of  any  empirical  study  or  to  be  motivated  by  any 
such  study.  The  author  is  not  aware  of  any  research,  either  inside  or  outside  the  military, 
which  has  produced  an  empirically-based  model  describing  utility  as  a  function  of  a 
person  attribute  and  a  job  attribute.  As  described  below,  similar  procedures  were  used 
to  determine  the  coefficients  for  the  raw  utility  functions  in  the  Priority/Preference  and 
the  Attrition  components  of  CLASP. 

Figure 3 shows a graph of equation Apt_Dif_Util with $U_{A/D}(A, D)$ plotted as a
function of Job Difficulty for fixed applicant Aptitude values A = 40, 50, 60, 80, 90, and
99. The uppermost curve on Figure 3 represents the utility values for the highest
technical aptitude level (99) across the entire range of job difficulty. The region at which
the curve assumes its maximum value occurs at the upper end of the difficulty scale.
This implies the utility function tends to assign the highest aptitude individuals to the
most technically complex ratings. The curve's gradual downward slope from the region
of greatest technical difficulty to the region of least technical difficulty implies that
smaller utility values are awarded when high-aptitude individuals are assigned to
low-difficulty jobs. Although the probability of such assignments is reduced accordingly,
they may still take place, due to the influence of the other CLASP components. The
lowest curve represents the utility values for the lowest technical aptitude level (40)
across the entire range of job difficulty. Its maximum value occurs at the lowest end of
the difficulty scale; its sharply downward slope in the direction of increasing job
difficulty means that low-ability applicants will almost always be assigned to the least
complex jobs.


[Figure 3 legend, as recovered: A = 40, A = 50, A = 65, A = 80]

Figure 3. Aptitude/Difficulty Utility.

The  middle  curve  indicates  that  applicants  of  average  ability  (65)  have  a  reasonable 
chance  to  be  assigned  to  ratings  of  all  difficulty  levels.  However,  given  that  the 
maximum  of  this  curve  occurs  in  the  range  of  intermediate  job  difficulty,  it  is  most  likely 
they  will  be  assigned  to  ratings  of  intermediate  technical  difficulty. 

Thus,  for  a  given  level  of  applicant  aptitude,  the  Aptitude/Difficulty  component 
awards  the  largest  utility  values  to  assignments  providing  the  closest  correspondence 
between  the  applicant's  ranking  on  the  technical  aptitude  score  distribution  and  the 
job's  ranking  on  the  job  difficulty  distribution.  In  other  words,  the  largest  utility  values 
are  awarded  when  high  aptitude  applicants  are  matched  up  with  most  difficult  jobs. 
Intermediate  aptitude  applicants  are  awarded  the  largest  utility  when  they  are  matched 
with  intermediate  difficulty  jobs,  although  the  utility  of  this  matchup  is  not  as  large  as 
that  of  the  high  aptitude  individual  and  high  difficulty  job.  Low  aptitude  applicants  are 
awarded  the  largest  utility  when  they  are  matched  with  low  difficulty  jobs,  although  the 
utility  of  this  matchup  is  not  as  large  as  that  of  the  intermediate  aptitude  individual  and 
intermediate difficulty job. Table D-1 in Appendix D shows, for each fixed A, the
difficulty level $D_{Max}(A)$ that maximizes $U_{A/D}(A,D)$. That is, for any fixed A, $D_{Max}(A)$ is the
difficulty level D such that $U_{A/D}(A, D_{Max}(A)) \ge U_{A/D}(A,D)$ for all D in [40,99]. As shown
therein, both $D_{Max}(A)$ and $U_{A/D}(A, D_{Max}(A))$ are increasing functions of A.

In Figure 3, the 6 curves for the 6 aptitude levels do not intersect. Thus, for a given
job difficulty level, the utility associated with assigning an applicant with aptitude $A_1$ is
greater than the utility associated with TAS score $A_2$ if $A_1 > A_2$. Stated differently, larger
applicant  aptitude  levels  result  in  larger  utility  values,  regardless  of  job  difficulty  level. 
As  a  general  rule,  this  seems  reasonable,  with  the  possible  exception  of  the  lowest  job 
difficulties.  One  may  argue  that  higher  applicant  aptitudes  should  result  in  smaller 
utilities  for  the  lowest  job  difficulties,  since  the  assignment  of  high  aptitude  individuals 
to  these  jobs  wastes  talent  that  could  productively  be  used  in  the  technically  more 
difficult  jobs.  Such  an  argument  could  be  used  in  support  of  the  AFQT  component  in 
RIDE. 


The standardized Aptitude/Difficulty payoff is calculated as:

$$U^{*}_{A/D}(A,D) = 50 + 10\left(\frac{U_{A/D}(A,D) - \mu_{AD}}{\sigma_{AD}}\right) \qquad \text{(Apt\_Dif\_Std)}$$
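The standardization applied to this component (and, in the same form, to the Priority/Preference and Attrition components) is a z-score rescaled to a mean of 50 and a standard deviation of 10. The following is a minimal sketch of that rescaling; the function name and the numeric inputs are hypothetical and chosen only for illustration, not taken from CLASP parameter files.

```python
def standardize(raw_utility, mean, std_dev):
    """Rescale a raw component utility to the 50 +/- 10 payoff scale.

    mean and std_dev stand in for the reference-population parameters
    (for example mu_AD and sigma_AD); the values used below are made up.
    """
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return 50.0 + 10.0 * (raw_utility - mean) / std_dev


if __name__ == "__main__":
    # A raw utility two standard deviations above the mean maps to 70.
    print(standardize(70.0, 60.0, 5.0))
    # A raw utility equal to the mean maps to 50.
    print(standardize(60.0, 60.0, 5.0))
```

Putting every component on the same scale is what lets the component weights be compared directly when the payoffs are later combined.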


CLASP  Parameter  Update  Considerations 

$\mu_{AD}$, $\sigma_{AD}$, and the job difficulty (i.e., job complexity) index parameters, $D_j$, for each
rating constitute the Aptitude/Difficulty component parameters subject to updating.

The CLASP parameter update software automatically generates updates for $\mu_{AD}$ and $\sigma_{AD}$
during  the  annual  CLASP  parameter  update.  Kroeker  (personal  communication,  1998) 
documents  the  procedures  and  methodology  he  used  to  update  the  original  set  of  job 
difficulty  parameters  he  developed  in  the  late  1970s  or  early  1980s.  The  author  knows  of 
no  reason  why  these  updated  parameters  could  not  be  implemented  in  CLASP  at  this 
time. 

RIDE AFQT Utility

The  purpose  of  the  RIDE  AFQT  Component  is  to  "penalize"  the  applicant's  utility 
scores  in  ratings  where  the  AFQT  score  suggests  the  applicant  is  over-qualified.  If  the 
degree  of  over-qualification  is  large  enough,  both  the  Navy's  and  the  applicant's 
interests  are  best  served  by  placement  in  a  rating  in  which  the  applicant’s  general 
aptitude  more  closely  matches  that  of  other  applicants  assigned  to  the  rating.  This 
concept  is  based  on  the  assumption  that  the  AFQT  score  represents  a  measure  of  the 
applicant's  overall,  general  aptitude,  while  the  ASVAB  selector  composite  score 
measures  specific  skills  and  aptitudes  for  the  rating. 

Figure 4 demonstrates this concept. The maximum AFQT utility is achieved by
individuals whose AFQT score is at or below the mean AFQT score $M_j$ of individuals assigned to
the rating. Utility decreases from a maximum of $Q_{Max} = 100$ to a minimum of $Q_{Min} = 0$ as
the individual's AFQT score substantially exceeds the mean AFQT score for that rating.

Figure 4. RIDE AFQT Utility.

Define $Q_{i,j}$ as the AFQT utility associated with assigning individual i to job option j.
Define $A_i$ as the AFQT score of individual i, $M_j$ as the mean of the AFQT distribution in
job option j, and $\sigma_j$ as the standard deviation of the AFQT distribution in job option j. In
addition, define $\delta_j > 0$ as the offset from $M_j$ that defines the maximum AFQT score for
which $Q_{i,j} = Q_{Max} = 100$. Also, define $\theta_j > M_j + \delta_j$ as the minimum AFQT score for which
$Q_{i,j} = Q_{Min} = 0$. In other words, as $A_i$ increases from 0 to 100, $M_j + \delta_j$ represents the
AFQT score at which the penalty begins to take effect, while $\theta_j$ is the AFQT score at
which the penalty first reaches its maximum.

In Figure 4, $M_j = 55$, $\sigma_j = 10$, $\delta_j = \sigma_j / 2$, and $\theta_j = M_j + 3.5\sigma_j$.
Then $M_j + \delta_j = M_j + \sigma_j / 2 = 60$ and $\theta_j = M_j + 3.5\sigma_j = 90$. Note that the specifications for
$M_j + \delta_j$ and $\theta_j$ have been modified since the Folchi (1999) specification. $Q_{i,j}$ may then be
defined as follows:


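$$Q_{i,j} = \begin{cases}
Q_{Max} & \text{if } A_i \le M_j + \delta_j, \\
Q_{Min} & \text{if } A_i \ge \theta_j, \text{ and} \\
\dfrac{A_i - \theta_j}{M_j + \delta_j - \theta_j}\, Q_{Max} & \text{if } M_j + \delta_j < A_i < \theta_j.
\end{cases}$$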
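The three-segment shape described above (flat at the maximum, a downward slope, then flat at the minimum) can be sketched as follows. This is an illustration under stated assumptions: the sloped segment is taken to be a straight line joining the two flat segments, and the defaults (an offset of half a standard deviation and an upper cutoff 3.5 standard deviations above the mean) are the Figure 4 values rather than operational RIDE settings. The function and parameter names are hypothetical.

```python
def afqt_utility(afqt, mean, std_dev, q_max=100.0, q_min=0.0,
                 delta_factor=0.5, theta_factor=3.5):
    """Piecewise AFQT utility for one applicant and one rating.

    Assumes a linear penalty between the two cutoffs; delta_factor and
    theta_factor default to the values quoted for Figure 4.
    """
    penalty_start = mean + delta_factor * std_dev  # M_j + delta_j
    penalty_full = mean + theta_factor * std_dev   # theta_j
    if afqt <= penalty_start:
        return q_max
    if afqt >= penalty_full:
        return q_min
    # Fraction of the maximum utility that remains on the sloped segment.
    remaining = (afqt - penalty_full) / (penalty_start - penalty_full)
    return q_min + (q_max - q_min) * remaining


if __name__ == "__main__":
    # Figure 4 parameters: mean 55, standard deviation 10.
    for score in (40, 60, 75, 90, 99):
        print(score, afqt_utility(score, mean=55, std_dev=10))
```

With the Figure 4 parameters the penalty starts at an AFQT score of 60 and is complete at 90, so a score of 75 sits halfway down the slope.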




RIDE AFQT and CLASP Aptitude/Difficulty Summary


In  summary,  Ua/d(A,D)  is  a  mathematical  representation  of  a  classification  policy 
that  assigns  each  applicant  to  a  rating  whose  technical  difficulty,  D,  approximately 
corresponds  with  technical  aptitude  A.  The  costs  (both  to  the  Navy  and  the  applicant)  of 
a  mismatch  seem  clear.  Worker  boredom  and  a  lost  opportunity  to  assign  individuals  to 
jobs  that  better  match  their  skills  and  aptitude  are  the  costs  associated  with  assigning 
applicants  to  jobs  that  are  too  easy.  Decreased  productivity  is  the  cost  of  assigning 
applicants  to  jobs  for  which  they  lack  the  required  aptitude  to  perform  properly. 

Comparison  of  Figures  3  and  4  indicates  the  CLASP  A/D  and  RIDE  AFQT 
components  are  quite  different.  RIDE  AFQT  penalizes  for  "over-qualification"  in  a  given 
rating  (as  measured  by  the  degree  to  which  the  applicant's  AFQT  score  exceeds  the 
$M + \delta$ point in that rating's AFQT distribution). RIDE imposes no such penalty for
under-qualification.  In  contrast,  CLASP  A/D  penalizes  for  "under-qualification"  (as 
measured  by  the  degree  to  which  the  applicant's  technical  aptitude  measure  is  less  than 
the  Job  Difficulty  measure  in  that  rating),  but  imposes  no  penalty  for  over-qualification. 
Regardless of rating, the RIDE AFQT utility function is monotonically non-increasing (flat
between the minimum AFQT score and $M + \delta$, downward sloping between $M + \delta$ and $\theta$,
and then flat between $\theta$ and the maximum AFQT score). In contrast, Figure 3 shows the
CLASP  A/D  function  is  monotonically  increasing  between  the  minimum  and  maximum 
values  of  A.  As  the  Job  Difficulty  increases,  Ua/d(A,D)  increases  more  rapidly  between 
the  minimum  and  maximum  values  of  A.  The  CLASP  policy  that  rewards  a  larger 
aptitude  with  a  larger  utility  value,  regardless  of  job  difficulty  level,  is  not  present  in  the 
RIDE  AFQT  model.  RIDE  rewards  larger  aptitudes  with  larger  utility  scores  only  in  the 
more  difficult  jobs. 

In  summary,  the  RIDE  AFQT  and  CLASP  A/D  components  seem  motivated  in 
conceptually opposite directions. Two possible methods for judging and comparing
them  are:  (a)  evaluate  them  in  context  with  the  remaining  model  components,  and  (b) 
evaluate  them  from  a  policymaker's  standpoint.  One  possible  technique  of 
accomplishing  (a)  is  to  apply  the  two  algorithms  to  a  baseline  set  of  applicant  records 
and,  applicant  by  applicant,  compare  the  RIDE  and  CLASP  optimal  lists.  In  particular, 
since  RIDE  requires  less  applicant  input  information,  this  technique  could  provide 
useful  insights  into  the  manner  in  which  the  SPSU  and  AFQT  components  interact  to 
generate  a  composite  RIDE  utility.  It  may  also  be  helpful  in  understanding  how  the 
School  Success  and  Aptitude/Difficulty  components  of  CLASP  interact,  and  how  the 
RIDE  composite  utilities  compare  with  a  composite  of  the  School  Success  and 
Aptitude/Difficulty  components  of  CLASP.  Evaluation  of  (b)  requires  a  policymaker  to 
express opinions on questions such as: Do the "under-qualification" and
"over-qualification" concepts make sense in the Navy environment? In particular, should
under-qualified  (or  over-qualified)  applicants  be  awarded  fewer  utility  points  if  their 
technical  aptitude  does  not  closely  match  the  technical  difficulty  level  of  a  given  rating, 
under  the  premise  that  too  low  (or  too  high)  an  aptitude  level  makes  them  less  likely  to 
succeed  in  that  rating? 

The 4 remaining CLASP components that do not have counterparts in RIDE are
discussed  in  the  following  sections. 

Navy Priority/Individual Preference Component

Kroeker  and  Rafacz  (1983)  provide  an  excellent  introduction  to  the 
Priority/Preference  component  and  description  of  the  priority  scale. 

The  purpose  of  this  component  is  to  incorporate  both  Navy  priorities  and 
individual  preferences  when  assigning  recruit  applicants  to  ratings.  These  two 
sets of objectives may be incompatible, particularly if both are described by
utility  functions  allowed  to  vary  independently.  For  example,  the  gain  in 
utility  resulting  from  an  applicant's  expression  of  strong  preference  for  a 
particular  rating  may  be  offset  by  a  loss  in  utility  if  the  rating  has  a  low  Navy 
priority. 

To  overcome  the  deficiency  of  a  strictly  additive  model,  an  interactive  utility 
function  was  designed.  Thus,  a  utility  value  is  obtained  as  a  function  of  the 
Navy  priority  index  for  a  particular  rating  in  conjunction  with  the  applicant's 
specified  preference  value  for  that  rating.  To  address  both  Navy  priority  and 
individual  preference,  two  scales  were  derived: 

Priority  Scale:  Navy  priorities  were  obtained  from  the  career  reenlistment 
objectives  listed  by  the  Office  of  the  Chief  of  Naval  Operations.  These 
priorities  were  augmented  and  modified  using  rating  popularity  and  rating 
size  as  variables  in  a  least  squares  regression  analysis.  The  resulting  priority 
scale  was  refined  by  data  collected  from  10  Navy  personnel  managers 
concerned  with  setting  recruiting  goals  and  “A”  School  priorities.  In  a 
procedure  similar  to  that  used  to  establish  the  job  complexity  scale,  these 
officers  compared  the  relative  importance  to  the  Navy  of  small  groups  of 
ratings,  by  pairs.  As  with  the  job  complexity  scale,  values  were  then  modified 
using a procedure to revise estimates of psychological scale values (Kroeker, 1982).

The  Kroeker  and  Rafacz  description  of  the  individual  preference  scale  is  not 
consistent  with  the  actual  CLASP  implementation.  Therefore,  the  following  alternative 
description  is  provided: 

An  individual  preference  value  is  computed  for  each  rating.  The  applicant 
classification  process  at  the  Military  Entrance  Processing  Station  (MEPS)  does  not  allow 
enough  time  for  the  recruit  to  rank  order  all  ratings  s/he  may  potentially  be  assigned  to. 
Therefore,  preference  values  are  not  determined  on  the  basis  of  individual  ratings. 
However,  since  each  rating  belongs  to  exactly  1  of  approximately  15  occupational  group 
categories,  preferences  are  determined  by  asking  the  applicant  to  rank  order  up  to  5 
occupational  groups  in  terms  of  preference.  Each  rating  in  the  most  preferred 
occupational group receives the highest possible preference value (100), each rating in
the  second  ranked  group  receives  the  second  highest  possible  preference  value  (90),  etc. 
Thus,  the  preference  scale  can  be  expressed  as 

$$Z_{i,j[r]} = 100 - 10(r - 1), \quad r = 1, 2, \ldots, n_p,$$ where

$Z_{i,j[r]}$ is the individual preference value for individual i in rating j[r],

the index j[r] ranges over all ratings in the rth ranked occupational category,

r is the occupation group ranking, and

$1 \le n_p \le 5$ is the number of occupational groups the applicant expresses a
preference for.

For ratings in the remaining occupational groups r for which the applicant did not
express a preference, $Z_{i,j[r]}$ is assigned the lowest possible preference value (20). Thus,
the  individual  preference  scale  ranges  between  20  and  100,  with  larger  preference 
values  associated  with  the  applicant's  most  preferred  ratings  and  smaller  values 
associated  with  his/her  least  preferred  ratings. 
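The mapping from ranked occupational groups to per-rating preference values can be sketched as below. The rating and group labels are hypothetical placeholders, and the function is an illustration of the scale as described here, not the CLASP implementation.

```python
def preference_values(ranked_groups, rating_to_group, unranked_value=20):
    """Assign an individual preference value to every rating.

    ranked_groups: up to 5 occupational groups, most preferred first.
    rating_to_group: maps each rating to its single occupational group.
    Ratings in unranked groups get the lowest value (20).
    """
    if not 1 <= len(ranked_groups) <= 5:
        raise ValueError("applicants rank between 1 and 5 groups")
    # Rank r (1-based) maps to 100 - 10 * (r - 1): 100, 90, 80, 70, 60.
    group_value = {group: 100 - 10 * (rank - 1)
                   for rank, group in enumerate(ranked_groups, start=1)}
    return {rating: group_value.get(group, unranked_value)
            for rating, group in rating_to_group.items()}


if __name__ == "__main__":
    # Hypothetical ratings and groups, for illustration only.
    ratings = {"rating_a": "group_1", "rating_b": "group_1",
               "rating_c": "group_2", "rating_d": "group_3"}
    print(preference_values(["group_2", "group_1"], ratings))
```

Because preferences are collected at the group level, every rating in the same group receives the same value, and the gap between the lowest ranked value (60 at most five ranks) and the unranked value (20) is larger than the gap between adjacent ranks.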

Given  the  Navy  priority  index  of  a  rating  and  the  individual's  preference  value  of  the 
rating,  the  unstandardized  Priority/Preference  utility  is  given  by  Equation 
Prior_Pref_Unstd : 

$$U_{PP}(W_j, Z_{i,j}) = 90.0 + (0.001)\,W_j^2 + (1.8)(Z_{i,j} - 100) - (0.0000014)\,W_j^2 (Z_{i,j} - 100)^2 - (0.00018)\,W_j^2 (Z_{i,j} - 100) + (0.009)(Z_{i,j} - 100)^2,$$ where

$U_{PP}(W_j, Z_{i,j})$ is the priority/preference utility associated with individual i in rating j,

$W_j$ is the Navy priority index value for rating j, and

$Z_{i,j}$ is the individual preference value for individual i in rating j.

In Figure 5, $U_{PP}(W_j, Z_{i,j})$ is plotted on the vertical axis against Individual Preference
on the horizontal axis, for priority values of 100, 80, 50, and 0. The four curves are
non-intersecting and appear in order of increasing priority level from bottom to top. Thus,
for any fixed individual preference value, a larger priority value generates a larger
priority/preference utility than a smaller priority value.

Figure 5. Priority/Preference Utility.

In  addition,  the  utility  for  each  priority  level  is  an  increasing  function  of  individual 
preference  level.  Thus,  the  utility  of  a  person-rating  match  increases  both  as  a  function 
of  the  rating  priority  (for  a  fixed  individual  preference  level)  and  as  a  function  of  the 
individual's  preference  for  the  rating  (for  a  fixed  priority  level).  However,  since  the 
curves  are  not  parallel,  utility  is  a  non-linear  function  of  priority  and  preference. 

The  uppermost  curve  represents  utility  values  corresponding  to  the  highest  level  of 
Navy priority (100) across the entire range of individual preferences. A strong or
moderate  preference  for  a  high  priority  rating  yields  a  high  utility  value,  since  both  the 
Navy's  and  the  applicant's  interests  are  satisfied  by  such  an  assignment.  A  low 
preference  for  a  high  priority  rating  yields  a  moderate  level  utility  that  expresses  the 
importance  of  the  rating  to  the  Navy.  The  lowest  curve  represents  utility  values 
corresponding to the lowest Navy priority level (0) across the range of individual
preferences.  A  strong  preference  for  a  low-priority  rating  produces  a  high  utility  because 
of  the  Navy's  attempt  to  honor  the  applicant's  preference.  A  moderate  degree  of 
preference  for  the  rating,  however,  results  in  a  relatively  low  utility  value  because  the 
Navy's  interests  are  not  served  by  such  an  assignment.  An  expression  of  no  preference 
for  a  low-priority  rating  results  in  the  lowest  possible  utility  level  because  neither  the 
Navy's  nor  the  applicant's  interests  are  satisfied. 
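The behavior described for the uppermost and lowest curves can be checked numerically with a short sketch of Equation Prior_Pref_Unstd. One assumption is made loudly here: every priority term, including the 0.001 term, is read as the square of the priority index, because that reading is the one that reproduces the anchor values the report lists for the fit (for example, a utility of 100 at priority 100 and preference 100, and 96.4 at priority 80 and preference 100). The function name is hypothetical.

```python
def priority_preference_utility(priority, preference):
    """Unstandardized Priority/Preference utility U_PP(W, Z).

    Assumption: the priority index enters only as W squared, including
    in the 0.001 term; this reading matches the listed anchor values.
    """
    w_sq = priority ** 2
    z_centered = preference - 100.0
    return (90.0
            + 0.001 * w_sq
            + 1.8 * z_centered
            - 0.0000014 * w_sq * z_centered ** 2
            - 0.00018 * w_sq * z_centered
            + 0.009 * z_centered ** 2)


if __name__ == "__main__":
    corners = [(100, 100), (100, 0), (0, 100), (0, 0), (80, 100), (80, 0)]
    for w, z in corners:
        value = priority_preference_utility(w, z)
        print(f"U_PP(W={w}, Z={z}) = {value:.1f}")
```

Since the preference scale itself bottoms out at 20, the points with a preference of 0 act as anchors for fitting the surface rather than as values an applicant can actually receive.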

Equation Prior_Pref_Unstd was developed by assuming that $U_{PP}(W, Z)$ is a
polynomial in Navy priority W and individual preference Z. It was assumed to be a
second degree polynomial in both W and Z, and thus it has 9 unknown coefficients:

$$U_{PP}(W,Z) = \sum_{i=0}^{2} \sum_{j=0}^{2} c_{i,j}\, W^{i} (Z - 100)^{j}$$

The initial conditions used to estimate the coefficients were similar to the following.
Their  reasonableness  can  be  verified  by  inspection  of  Figure  5: 

(1) $U_{PP}(100, 0) = 50$.

(2) $U_{PP}(100, 100) = 100$.

(3) $\partial U_{PP} / \partial Z = 0$ when evaluated at Z = 100, W = 100.

(4) $U_{PP}(0, 100) = 90$.

(5) $U_{PP}(0, 0) = 0$.

(6) $\partial U_{PP} / \partial Z = 0$ when evaluated at Z = 0, W = 0.

(7) $U_{PP}(80, 100) = 96.4$.

(8) $U_{PP}(80, 0) = 32$.

(9) $\partial^2 U_{PP} / \partial Z^2 = 0$ when evaluated at W = 80.

Condition (9) states that $U_{PP}(80, Z)$ is a linear function of Z, and so its second partial
derivative with respect to Z should equal zero when evaluated at W = 80. The
standardized Priority/Preference payoff is obtained from the equation:

$$U^{*}_{PP}(W_j, Z_{i,j}) = 50 + 10\left(\frac{U_{PP}(W_j, Z_{i,j}) - \mu_{PP}}{\sigma_{PP}}\right),$$ where

$U^{*}_{PP}(W_j, Z_{i,j})$ is the standardized priority/preference payoff associated with
individual i and rating j,

$U_{PP}(W_j, Z_{i,j})$ is the unstandardized priority/preference utility for individual i and
rating j, and

$\mu_{PP}$ and $\sigma_{PP}$ are the mean and standard deviation, respectively, of $U_{PP}(W_j, Z_{i,j})$
scores in the reference population.


CLASP  Parameter  Update  Considerations 

The Navy priority indices, $\mu_{PP}$, and $\sigma_{PP}$ for each rating are the three
Priority/Preference component parameters that require updates. The CLASP parameter
update software automatically generates updates for $\mu_{PP}$ and $\sigma_{PP}$ during the annual
CLASP parameter update. However, no known documentation describing procedures,
methodology, or software for updating the Navy priority indices exists, other than the
summary description given in Kroeker and Rafacz and repeated above. The priority
indices were never updated during the author’s association with CLASP between 1980
and  1999.  In  the  absence  of  detailed  information  to  supplement  the  summary 
description,  it  is  not  feasible  to  perform  future  Navy  priority  index  updates. 


Minority-Fill Component


Prior  to  CLASP,  minority  group  members  were  assigned  in  disproportionately  large 
numbers  to  a  few  ratings  and  in  small  numbers  to  many  others.  The  minority-fill 
component  was  designed  to  provide  a  uniform  assignment  of  minority  group  members 
for  each  rating.  A  uniform  rate  of  non-minority  assignments  is  also  implied.  The  goal 
was  for  the  proportion  of  minority  group  members  in  any  rating  to  always  equal  the 
previously  specified  minority  proportion  goal  for  the  rating. 

Kroeker's  methodology  for  determining  minority  proportion  goals  during  his  tenure 
on  the  CLASP  project  is  largely  undocumented.  However,  each  goal  was  apparently 
constructed  to  compensate  for  historical  minority  fill  trends.  If  historical  minority  fill 
rates  for  a  given  rating  were  less  than  historical  minority  fill  rates  across  all  ratings  (e.g., 
Navy-wide  minority  fill  rates),  then  a  minority  fill  goal  larger  than  the  historical  average 
was  specified.  Conversely,  if  the  historical  fill  rate  for  a  given  rating  was  greater  than  the 
average  historical  rate  across  all  ratings,  then  a  minority  fill  goal  smaller  than  the 
historical  average  was  specified.  Beginning  with  the  2001  CLASP  parameter  update, 
NPRST  began  using  a  common  (Navy-wide)  minority  goal  for  all  ratings. 

Differences  between  the  actual  and  desired  minority  group  proportions  at  any  given 
time  in  the  reservation  cycle  indicate  the  current  status  of  the  uniform  fill-rate  objective 
function.  The  function  compensates  for  current  conditions  by  (1)  adding  utility  points 
for  minority  group  members  and  subtracting  utility  points  for  non-minority  group 
members  when  the  current  proportion  of  minority  group  members  is  less  than  the 
minority  goal,  and  (2)  subtracting  utility  points  for  minority  group  members  and  adding 
utility  points  for  non-minority  group  members  when  the  current  proportion  of  minority 
group  members  is  greater  than  the  minority  goal.  The  equation  defining  the  feedback 
function  is  given  by 

$$M_{i,j} = (G_j - F_{j,t})\, I_{M/NM},$$ where:

$M_{i,j}$ is the minority fill difference associated with assigning individual i to rating j at
time t,

$G_j$ is the desired minority-fill goal for rating j,

$F_{j,t}$ is the actual minority fill proportion for rating j at time t (i.e., the ratio of the
number of minority accessions in rating j to the total number of accessions in
rating j), and

$I_{M/NM}$ is a variable whose value is 1 if the individual being classified at time t is a
minority group member and is -1 if the individual is a non-minority group
member.

The standardized minority-fill payoff is computed according to the equation below.
The quantity of utility points added or subtracted is proportional to the difference
between the actual and desired fill proportions:

$$U_{MF}(i,j,t) = 50 + 10\left(\frac{M_{i,j}}{\sigma_{MF}}\right),$$ where

$U_{MF}(i,j,t)$ is the standardized minority fill payoff for individual i being classified at
time t with respect to rating j,

$M_{i,j}$ is defined in the previous equation, and

$\sigma_{MF}$ is the standard deviation of $M_{i,j}$ differences in the reference population.

The  above  equation  represents  the  minority  fill  payoff  function  used  in  the  CLASP 
simulation  model  and  the  operational  CLASP  model.  Note  that  there  is  a  difference  in 
the  denominators  of  this  equation  and  the  corresponding  equation  in  Kroeker  and 
Rafacz  (1983);  the  reason  for  this  difference  is  unknown. 
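The feedback behavior of the minority-fill payoff can be illustrated with a small sketch. It assumes the payoff is 50 plus 10 times the signed fill difference divided by its reference-population standard deviation, with no mean subtracted, which is how the component is described here. The function name and all numeric values are hypothetical.

```python
def minority_fill_payoff(goal, actual_fill, is_minority, sigma_mf):
    """Standardized minority-fill payoff for one applicant and rating.

    goal: desired minority proportion for the rating (G_j).
    actual_fill: current minority proportion for the rating (F_{j,t}).
    sigma_mf: reference-population standard deviation of the differences.
    """
    if sigma_mf <= 0:
        raise ValueError("sigma_mf must be positive")
    indicator = 1 if is_minority else -1          # I_{M/NM}
    difference = (goal - actual_fill) * indicator  # M_{i,j}
    return 50.0 + 10.0 * difference / sigma_mf


if __name__ == "__main__":
    # Rating currently below its goal: minority applicants gain points,
    # non-minority applicants lose the same number of points.
    print(minority_fill_payoff(0.20, 0.15, True, sigma_mf=0.05))
    print(minority_fill_payoff(0.20, 0.15, False, sigma_mf=0.05))
```

When the actual proportion equals the goal, both groups receive the neutral payoff of 50, so the component has no effect on the ranking of ratings.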


Minority  Fill  Parameter  Update  Considerations 

The only Minority Fill component parameter subject to updating is $\sigma_{MF}$. The CLASP
parameter update software automatically generates an update for $\sigma_{MF}$ during the annual
CLASP  parameter  update. 


Fraction  Fill  Component 


Prior  to  CLASP,  the  end  of  each  recruiting  month  was  typically  marked  by  a  flurry  of 
recruiting  activity  aimed  at  filling  a  substantial  number  of  positions  in  certain  ratings. 
From  a  managerial  perspective,  a  procedure  resulting  in  a  uniform  rate  of  assignment 
across  all  ratings  is  highly  desirable.  The  fraction  fill  component  was  designed  to 
compare  the  proportion  of  applicants  assigned  to  a  particular  rating  with  the  average 
proportion  of  applicants  assigned  to  all  ratings  at  the  time.  If  the  fill  proportion  for  the 
rating  in  question  is  less  than  the  average  fill  proportion,  additional  utility  points  are 
awarded  to  influence  the  applicant  to  select  the  rating.  If  selected,  the  rating  fill 
proportion moves closer to the average fill rate. Similarly, utility points are subtracted when
the  proportion  of  the  recruiting  goal  that  has  been  filled  in  a  given  rating  exceeds  the 
average  fill  proportion.  If  the  applicant  selects  a  different  rating,  the  resulting  average 
fill  rate  increases  slightly,  thereby  moving  closer  to  the  rating  fill  proportion.  The 
operational  part  of  the  fraction-fill  utility  function  is  given  by: 

$$T_{j,t} = B_t - F_{j,t},$$ where

$T_{j,t}$ is the difference in proportions for rating j when individual i is classified at time
t,

$B_t$ is the average fill proportion across all ratings at time t, and

$F_{j,t}$ is the proportion of applicants that have been assigned to openings with rating j
up to time t.

The standardized fraction fill payoff is calculated as:

$$U_{FF}(i,j,t) = 50 + 10\left(\frac{T_{j,t}}{\sigma_{FF}}\right),$$ where

$\sigma_{FF}$ is the standard deviation of $T_{j,t}$ differences in the reference population.
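A minimal sketch of the fraction-fill payoff follows. It assumes the average fill proportion is the unweighted mean of the per-rating fill proportions, which the text does not specify; the rating labels, fill values, and standard deviation are hypothetical.

```python
def fraction_fill_payoff(fill_by_rating, rating, sigma_ff):
    """Standardized fraction-fill payoff for one rating.

    fill_by_rating: current fill proportion for each rating (F_{j,t}).
    Assumption: B_t is the simple mean of those proportions.
    """
    if sigma_ff <= 0:
        raise ValueError("sigma_ff must be positive")
    average_fill = sum(fill_by_rating.values()) / len(fill_by_rating)  # B_t
    difference = average_fill - fill_by_rating[rating]                 # T_{j,t}
    return 50.0 + 10.0 * difference / sigma_ff


if __name__ == "__main__":
    # Hypothetical fill proportions for three ratings.
    fills = {"rating_a": 0.2, "rating_b": 0.4, "rating_c": 0.6}
    for name in fills:
        print(name, round(fraction_fill_payoff(fills, name, sigma_ff=0.1), 1))
```

Ratings that lag the average fill gain points and ratings that lead it lose points, which is the mechanism that pushes assignments toward a uniform fill rate.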


Fraction  Fill  Parameter  Update  Considerations 

The only Fraction Fill component parameter subject to updating is $\sigma_{FF}$. The CLASP
parameter update software automatically generates an update for $\sigma_{FF}$ during the annual
CLASP  parameter  update. 


Attrition  Component 


One  apparent  motive  for  adding  the  Attrition  Component  to  the  CLASP  model  was  to 
incorporate additional non-ASVAB information into the classification process. The
"Attrition" concept has a broader definition in the context of the Attrition Component than
it does in the context of the School Success component. In School Success, attrition is
defined solely in terms of “A” School attrition, while in the context of the Attrition
Component, it is defined in terms of Navy-wide attrition. The person attribute is the
Success  Chances  of  Recruits  Entering  the  Navy  (SCREEN)  score,  which  is  based  upon 
AFQT,  education  credential  status,  and  age.  Thus,  the  Attrition  component  incorporates 
non-ASVAB  information  about  the  applicant's  education  credential  status  and  his/her 
age,  and  information  concerning  attrition  in  the  rating  from  sources  other  than  “A” 
School  into  the  classification  process. 

Like  the  Aptitude/Complexity  and  Navy  Priority/Personnel  Preference  Components, 
the  Attrition  component  uses  an  individual  characteristic  measure  and  a  rating 
characteristic  measure  to  evaluate  utility.  The  Attrition  Component  evaluates  the  utility 
of  assigning  a  given  individual  to  a  given  rating,  based  upon  the  probability  of  surviving 
the  first  term  of  enlistment  and  the  attrition  severity  index  (ASI)  of  the  rating.  The 
person  characteristic  measure  is  the  SCREEN  table  (Lockman,  1977)  and  the  rating 
characteristic  measure  is  the  ASI.  The  SCREEN  score,  which  is  based  upon  the 
individual's  education  credential  status,  AFQT  score,  and  age,  reflects  the  probability  of 
successfully  completing  the  first  term  of  his  enlistment.  The  ASI  was  developed  using  5 
factors:  retention  rate,  personnel  replacement  costs,  rating  size  (number  of  personnel  in 
the  rating),  rating  requirements  (need  for  trained  personnel  in  the  rating),  and  priority 
(relative  importance  of  the  rating)  to  the  Navy.  A  multiplicative,  multi-attribute  model 
was  then  used  to  calculate  the  ASI  from  the  5  factors  (Thomas,  Elster,  Euske,  &  Griffin, 
1984). 

The following discusses the construction of the attrition component policy function
describing  the  utility  of  assigning  an  individual  to  a  rating  on  the  basis  of  the 
individual's  attrition  risk  and  the  attrition  severity  characteristics  of  the  ratings.  The 
SCREEN  table  is  constructed  so  that  an  individual  is  a  low  (high)  risk  to  attrite  during 
the  first  term  of  his  enlistment  if  his  SCREEN  score  is  high  (low).  Accordingly,  for 
purposes  of  deriving  the  attrition  policy  function,  the  low  attrition  risk  individual  is 
defined  as  having  a  SCREEN  of  96,  while  the  high  attrition  risk  individual  is  defined  as 
having  a  SCREEN  of  70.  The  ASI  scale  is  constructed  so  that  a  rating  is  characterized  by 
high  (low)  attrition  severity  if  its  ASI  is  large  (small).  Accordingly,  a  rating  with  a  low 
attrition  severity  problem  is  defined  as  having  an  ASI  of  10,  while  a  rating  with  a  high 
attrition  severity  problem  is  defined  as  having  an  ASI  equal  to  80. 

SCREEN  rank-orders  the  applicant  population  and  the  ASI  scale  rank-orders  the 
ratings,  thus  the  assignment  of  a  low-risk  applicant  (high  SCREEN)  to  a  rating  with  a 
large  ASI  is  a  desirable  outcome  and  should  receive  high  utility.  In  fact,  the  policy 
function  was  constructed  so  that  this  assignment  received  the  largest  possible  value 
(100). Although the low-risk applicant is also a low risk to attrite from a low ASI rating,
it  is  more  sensible  from  a  classification  policy  standpoint  to  assign  this  applicant  to  the 
high  ASI  ratings,  and  fill  the  low  ASI  ratings  with  individuals  characterized  by  a  slightly 
larger  risk  to  attrite.  Accordingly,  the  assignment  of  the  low-risk  applicant  to  the  low- 
risk  rating  received  an  intermediate  value  of  60.  The  assignment  of  a  high-risk  applicant 
to  a  low-risk  rating  received  a  value  of  55,  slightly  less  than  the  value  of  the  assignment 
of  the  low-risk  applicant  to  the  low-risk  rating.  Finally,  the  assignment  of  a  high-risk 
applicant  to  a  high  ASI  rating  results  in  the  largest  possible  risk  that  the  applicant  will 
attrite. Accordingly, this undesirable outcome received the lowest possible value (0).
Substitution of these four functional specifications yields four linear equations in four
unknown coefficients: $C_{0,0}$, $C_{0,1}$, $C_{1,0}$, and $C_{1,1}$.


$$U_{Atr}(S, V) = C_{0,0} + C_{1,0}(S - 70) + C_{0,1}(V - 80) + C_{1,1}(S - 70)(V - 80),$$ where

$U_{Atr}(S, V)$ = non-standardized attrition component utility of assigning person i to
job option j,

S = applicant's SCREEN score,

V = attrition severity index.

Solution of the 4 equations yields these estimates: $C_{0,0} = 0.0$, $C_{1,0} = 3.846$,
$C_{0,1} = -0.7857$, and $C_{1,1} = 0.0522$.
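The four coefficient estimates can be reproduced directly from the four anchor assignments given in the text (SCREEN of 96 or 70 paired with ASI of 80 or 10). Because the polynomial is centered at S = 70 and V = 80, each anchor isolates one coefficient in turn. The sketch below is an illustration of that arithmetic; the variable and function names are hypothetical.

```python
# Anchor utilities from the policy description: (SCREEN, ASI) -> utility.
ANCHORS = {
    (96, 80): 100.0,  # low-risk applicant, high-severity rating
    (96, 10): 60.0,   # low-risk applicant, low-severity rating
    (70, 10): 55.0,   # high-risk applicant, low-severity rating
    (70, 80): 0.0,    # high-risk applicant, high-severity rating
}

ds = 96 - 70  # SCREEN offset from the centering point
dv = 10 - 80  # ASI offset from the centering point

c00 = ANCHORS[(70, 80)]                    # both offsets are zero here
c10 = (ANCHORS[(96, 80)] - c00) / ds       # only the SCREEN term is active
c01 = (ANCHORS[(70, 10)] - c00) / dv       # only the ASI term is active
c11 = (ANCHORS[(96, 10)] - c00 - c10 * ds - c01 * dv) / (ds * dv)


def attrition_utility(screen, asi):
    """Non-standardized attrition utility U_Atr(S, V)."""
    s, v = screen - 70, asi - 80
    return c00 + c10 * s + c01 * v + c11 * s * v


if __name__ == "__main__":
    print(round(c00, 4), round(c10, 3), round(c01, 4), round(c11, 4))
    print(round(attrition_utility(96, 45), 1))  # intermediate ASI curve
```

Rounded to the precision quoted in the text, the computed coefficients agree with the reported estimates.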


In Figure 6, the non-standardized attrition utility $U_{Atr}(S, V)$ is plotted on the vertical
axis against SCREEN on the horizontal axis, for fixed ASI values of 10, 45, and 80. The
standardized Attrition payoff is obtained from the equation:

$$U^{*}_{Atr}(V_j, S_i) = 50 + 10\left(\frac{U_{Atr}(V_j, S_i) - \mu_{Atr}}{\sigma_{Atr}}\right),$$ where

$U^{*}_{Atr}(V_j, S_i)$ = standardized attrition component payoff associated with individual i
and rating j,

$U_{Atr}(V_j, S_i)$ = non-standardized attrition component payoff for individual i and
rating j, and

$\mu_{Atr}$ and $\sigma_{Atr}$ are the mean and standard deviation, respectively, of $U_{Atr}(V_j, S_i)$
scores in the reference population.


[Figure 6 plot: horizontal axis SCREEN; curves for ASI = 10, ASI = 45, and ASI = 80.]

Figure 6. Attrition Utility at constant severity values.

CLASP  Parameter  Update  Considerations 

The attrition severity index (ASI) parameters for each rating along with $\mu_{Atr}$ and $\sigma_{Atr}$
constitute the attrition component parameters subject to updating. The CLASP
parameter update software automatically generates updates for $\mu_{Atr}$ and $\sigma_{Atr}$ during the
annual  CLASP  parameter  update.  Thomas,  Elster,  Euske,  and  Griffin  (1984)  document 
the  procedures  and  methodology  they  used  to  develop  the  original  set  of  ASI  parameters 
in  the  early  1980s.  However,  the  ASI  parameters  have  not  been  updated  since  their  1983 
implementation.  In  the  absence  of  (a)  detailed  information  to  supplement  the  Thomas 
et  al.  report,  (b)  knowledge  of  and  access  to  all  relevant  attrition,  replacement  cost,  and 
demand  for  personnel  information,  and  (c)  software  to  calculate  the  updates,  it  is  not 
feasible to perform future ASI parameter updates.

CLASP Component Weights and Composite Payoff


As previously described, a payoff vector for each CLASP component is calculated, the
jth entry of which is the standardized payoff of assigning a given individual to the jth
rating. For each rating, the weighted sum of the six components represents the
composite (overall) utility associated with assigning the individual to that rating. The
weighted sum for each rating, hereafter called the "composite payoff," is given by:

$$U_{i,j} = \sum_{k=1}^{6} w_k U^{*}_{i,j,k}, \quad \text{where } \sum_{k=1}^{6} w_k = 100 \qquad \text{(Composite Payoff)}$$

where $U_{i,j}$ is the composite payoff for the $i$th individual with respect to job option $j$,
$U^{*}_{i,j,k}$ is the standardized component $k$ payoff for the $i$th individual with respect to
job option $j$, and $w_k$ is the weight associated with component $k$.

The  component  weights  were  determined  by  Navy  classification  policy.  Each  weight 
expresses,  in  some  sense,  the  policymaker's  desired  "contribution"  of  each  component  to 
the  composite.  In  practice,  however,  "contribution"  is  difficult  to  define  mathematically. 
Correlations  among  the  6  components  make  it  difficult  to  state  an  exact  relationship 
between  the  component  weight  and  the  proportion  of  variance  that  the  component 
contributes to the composite payoff. However, standardization of the component payoffs
allows  CLASP  to  partially  control  each  component's  contribution  to  the  composite 
variance.  As  a  result,  each  component  weight  provides  a  reasonable  approximation  to 
the  policymaker's  desired  contribution  of  each  component. 

As  described  by  Kroeker  and  Rafacz  (1983),  the  component  weights  were  derived 
according  to  the  following  criteria:  The  raw  utility  scores  for  the  school  success  and 
aptitude/complexity  components  were  examined.  It  was  observed  that  the  variance  of 
the  aptitude/complexity  scores  was  affected  by  a  number  of  extreme  values.  For  the 
center  of  the  scale  to  function  effectively  in  discriminating  between  persons,  it  was 
decided  that  the  variance  of  the  weighted  aptitude/complexity  component  should  be 
allowed  to  assume  a  larger  value  than  that  of  the  weighted  school  success  component, 
but  by  no  more  than  a  ratio  of  3:2.  Respective  weights  of  26  and  35  for  the  school 
success  and  aptitude/complexity  components  satisfied  this  criterion.  The  second 
criterion  stipulated  that  the  priority/preference  component  should  carry  approximately 
the same weight (14) as the combined minority-fill and fraction-fill component weights (15).
The  minority-fill  component  was  given  a  slightly  larger  weight  than  the  fraction-fill 
component,  resulting  in  weights  of  8  and  7  respectively.  The  attrition  component  weight 
(10)  was  assigned  according  to  the  requirement  that  it  not  exceed  the  individual  weights 
of the school success, aptitude/difficulty, and priority/preference components.

Table 2

Component weights

Component              Weight
School Success         26
Aptitude/Complexity    35
Priority/Preference    14
Minority-Fill          8
Fraction-Fill          7
Attrition              10
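
The composite payoff formula and the Table 2 weights can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not CLASP code: the dictionary keys, the function names, and the sample applicant values are hypothetical, and the `standardize` helper simply applies the mean-50, standard-deviation-10 scaling used for the component payoffs.

```python
# Table 2 component weights (they sum to 100).
WEIGHTS = {
    "school_success": 26,
    "aptitude_complexity": 35,
    "priority_preference": 14,
    "minority_fill": 8,
    "fraction_fill": 7,
    "attrition": 10,
}


def standardize(raw, mean, sd):
    """Map a raw component utility onto the mean-50, SD-10 payoff scale."""
    return 50.0 + 10.0 * (raw - mean) / sd


def composite_payoff(std_payoffs, weights=WEIGHTS):
    """Weighted sum of the six standardized component payoffs for one rating."""
    return sum(weights[name] * std_payoffs[name] for name in weights)


# Hypothetical applicant/rating pair: every component sits at the population
# mean except attrition, which is one standard deviation above its mean.
payoffs = {name: 50.0 for name in WEIGHTS}
payoffs["attrition"] = standardize(raw=12.0, mean=10.0, sd=2.0)  # 60.0

print(composite_payoff(payoffs))  # 5100.0
```

Because the weights sum to 100, an applicant at the mean on every component receives a composite payoff of 5000; the one-standard-deviation attrition advantage adds 10 x 10 = 100.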

CLASP Decision Indices and Optimality Indicators

This section describes computation of applicants’ decision index (DI) and optimality
indicator (OI) distributions from their composite payoff vector. CLASP computes the
decision index for each rating as the difference between the composite payoff and the
corresponding decision index mean (DIM):

$$\Delta_{i,j} = U_{i,j} - \bar{U}_{j} \qquad \text{(Decision Index)}$$

where $\Delta_{i,j}$ is the DI for the $i$th individual with respect to job option $j$,
$U_{i,j}$ is the composite payoff for the $i$th individual with respect to job option $j$, and
$\bar{U}_{j}$ is the DIM for job option $j$.

As previously described, CLASP attempts to force the classifier and applicant to
select a job option close to the top of the optimal list and makes it more difficult to select
an option near the bottom. However, the joint distribution of the vector of composite
utility functions (across all job options) may be such that certain job options make
infrequent appearances near the top of the optimal list and, consequently, classifiers
cannot access them frequently enough to satisfy recruiting goals. Such a scenario may
occur, for instance, when the quota is large and the expected value of the composite
utility function for that job option is small, relative to the other job options. To
compensate, a decision index mean (DIM) for each job option is subtracted from the
applicant's composite utility score for that job. Each DIM is the mean of the composite
payoff distribution for that job option with respect to the applicant population (Ward,
1958). For each job, the expected value of the difference between the composite utility
and the DIM is zero. This adjustment ensures that, over the long run, each job option is
as likely to appear near the top of the optimal list as it is to appear near the bottom.
Analysis of historical CLASP transaction data has demonstrated that this adjustment is
usually adequate to ensure that sufficient CLASP presentations are generated to allow
classifiers to cover the quota for each job.

The decision indices are transformed onto a scale ranging from 0 to 100 for
presentation to the classifier and applicant. This is accomplished in two stages. Stage 1
transforms the individual's decision index distribution onto a first-stage OI scale having
a mean of 50 and standard deviation of 20. CLASP performs this transformation using a
weighted mean and standard deviation of the individual's DI distribution, where each
weight is the current number of available openings (i.e., quota minus reservations-to-date)
for the rating and ship month. (Note: Equation [16] in Kroeker and Rafacz [1984]
is not consistent with the Stage 1 transformation performed in CLASP. The operational
CLASP implementation includes a [weighted] mean, while equation [16] does not.) The
Stage 1 OI list is then sorted in descending order. OIs on the sorted list are then
translated and truncated to generate the Stage 2 list. The combined translation and
truncation operations give the highest-rated rating on the Stage 2 scale an OI of 100 and
ensure that none of the OIs at the bottom of the list are less than zero. Equation (17) of
Kroeker and Rafacz (1984) describes the translation. After translation, each negative OI
on the list is set equal to zero.
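
The two-stage transformation described above can be sketched in Python as follows. This is a rough illustration, not the operational CLASP code. Two details are assumptions: the weighted standard deviation is taken as the square root of the openings-weighted mean squared deviation, and the Stage 2 translation is taken to be a constant shift that moves the top OI to 100 (Equation 17 of Kroeker and Rafacz is not reproduced in this report). All function names and sample values are hypothetical.

```python
import math


def decision_indices(composite, dim):
    """DI for each rating: composite payoff minus that rating's DIM."""
    return [u - m for u, m in zip(composite, dim)]


def stage1_oi(di, openings):
    """Rescale DIs to mean 50, SD 20 using openings-weighted moments.

    Assumes at least two distinct DI values, so the weighted SD is non-zero.
    """
    total = sum(openings)
    mean = sum(w * d for w, d in zip(openings, di)) / total
    var = sum(w * (d - mean) ** 2 for w, d in zip(openings, di)) / total
    sd = math.sqrt(var)
    return [50.0 + 20.0 * (d - mean) / sd for d in di]


def stage2_oi(stage1):
    """Sort descending, shift so the top OI is 100, and floor at zero."""
    ordered = sorted(stage1, reverse=True)
    shift = 100.0 - ordered[0]  # assumed form of the translation step
    return [max(0.0, oi + shift) for oi in ordered]


# Hypothetical composite payoffs, DIMs, and openings for three ratings.
di = decision_indices([5200.0, 4900.0, 5000.0], [5000.0, 5000.0, 5000.0])
first_stage = stage1_oi(di, openings=[1, 1, 2])
second_stage = stage2_oi(first_stage)

# Truncation example with hand-picked first-stage values.
print(stage2_oi([150.0, 60.0, 20.0]))  # [100.0, 10.0, 0.0]
```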

RIDE Composite Payoff

The RIDE composite utility for individual $i$ and job option $j$ is

$$C_{ij} = w_{SPSU}\, S_{ij} + w_{AFQT}\, Q_{ij}$$

where $w_{SPSU} = w_{AFQT} = 1/2$ are the respective SPSU and AFQT component weights,

$C_{ij}$ is the composite RIDE utility for individual $i$ and job option $j$, and

$S_{ij}$ and $Q_{ij}$ are the SPSU (Stage III) and AFQT utility scores for individual $i$ and
job option $j$.


Discussion 


This  section  discusses  certain  issues  raised  during  the  course  of  the  CLASP-RIDE 
comparison  that  may  be  relevant  to  classification  policymakers.  These  issues  include  (1) 
standardization  of  RIDE  components,  (2)  incorporation  of  factors  into  the  classification 
decision  that  are  excluded  from  the  CLASP  and  RIDE  algorithms,  including  non- 
psychological/psychometric  variables,  and  certain  dynamic  and  time-critical  factors,  (3) 
the  PDR  concept  and  parameterization  of  RIDE,  and  (4)  discussion  of  Bin  model  vs. 
LRM  results. 

CLASP  standardizes  each  of  its  6  components  so  that  each  has  a  mean  of  50  and 
standard  deviation  of  10.  As  previously  described,  CLASP  policymakers  apparently  felt  it 
was  important  to  apply  appropriate  nominal  weights  to  each  component  and 
standardize  the  component  score  distributions  in  such  a  manner  that  the  effective 
component  weights  closely  approximate  the  nominal  weights.  If  classification 
policymakers  are  also  concerned  about  consistency  between  the  nominal  and  effective 
weights  of  the  SPSU  and  AFQT  components,  they  should  recognize  that  the  nominal 
weights  for  the  SPSU  and  AFQT  components  (currently  50%  for  each)  are  probably  not 
the  same  as  the  effective  weights.  The  actual  weight  of  each  RIDE  component  is 
determined by the product of the nominal weight and the standard deviation of the
component. Hence, the component with the largest standard deviation has the largest
effective  weight.  If  classification  policymakers  wish  to  specify  the  actual  weight  of  each 
component,  each  component  must  be  standardized  (by  dividing  by  its  standard 
deviation)  before  calculation  of  the  RIDE  composite  payoff. 
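
The gap between nominal and effective weights described in the paragraph above can be shown with a small Python sketch. The standard deviations are made-up values, and expressing the effective weight as a normalized share (nominal weight times standard deviation, divided by the sum of those products) is an assumption made here for readability.

```python
def effective_weights(nominal, sds):
    """Effective share of each component: nominal weight times SD, normalized."""
    products = {k: nominal[k] * sds[k] for k in nominal}
    total = sum(products.values())
    return {k: p / total for k, p in products.items()}


nominal = {"SPSU": 0.5, "AFQT": 0.5}

# Hypothetical standard deviations of the raw component utilities.
sds = {"SPSU": 3.0, "AFQT": 1.0}
print(effective_weights(nominal, sds))  # {'SPSU': 0.75, 'AFQT': 0.25}

# After each component is divided by its own SD, every component has SD 1,
# so the effective weights coincide with the nominal weights.
unit_sds = {k: 1.0 for k in sds}
print(effective_weights(nominal, unit_sds))  # {'SPSU': 0.5, 'AFQT': 0.5}
```

In this hypothetical case, equal nominal weights of 50% give the SPSU component three times the influence of the AFQT component until both are standardized.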

In  the  current  CLASP  implementation,  classifiers  must  attempt  to  sell  each  applicant 
an  option  from  the  Top  15  prior  to  viewing  the  CLASP  optimal  list.  However,  before 
introduction  of  the  Top  15  feature,  the  CLASP  optimal  list  presentation  strategy 
indicates  there  was  considerable  emphasis  on  placing  each  applicant  into  an  "optimal" 
job  assignment.  This  emphasis  was  manifested  in  the  manner  in  which  the  optimality 
output  was  used  in  the  classification  interview,  particularly  in  the  optimal  list 
presentation  strategy  and  the  classifier's  role  in  selling  an  option  on  the  list.  This 
strategy  forced  the  classifier  and  applicant  to  view  the  CLASP  optimal  list  in  groups  of  5, 
10,  or  15  job  options  at  a  time,  depending  upon  the  applicant's  projected  enlistment 
date.  The  first  group  of  options  consisted  of  those  jobs  with  the  highest  optimality 
scores.  In  theory,  the  classifier's  role  was  to  convince  the  applicant  to  buy  an  option  on 
this  list  because  they  were  considered  the  best  possible  matches.  If  the  classifier  could 
not  sell  one  of  these  options,  he  would  try  to  sell  an  option  from  the  job  group  with  the 
next  largest  set  of  optimality  scores.  In  theory,  the  classifier  would  continue  working 
down  the  optimal  list  until  a  group  containing  a  mutually  satisfactory  option  was  found. 
Although  it  was  possible  for  classifiers  to  access  and  sell  options  near  the  bottom  of  the 
CLASP  optimal  list,  it  was  more  difficult  and  time  consuming  for  them  to  do  so. 

CLASP  was  developed  during  a  period  when  several  papers  in  the 
Industrial/Organizational  psychology  literature  touted  the  potential  benefits  of 
automating  empirical  models  describing  the  utility  of  matching  applicants  to  jobs 
(Dunnette  &  Borman,  1979).  An  attitude  prevailed  that  most  or  all  factors  considered 
during  personnel  classification  decisions  could  and  should  be  implemented  on  the 
computer.  In  apparent  accordance  with  this  point  of  view,  CLASP  was  sold  to 
Commander,  Navy  Recruiting  Command  (CNRC)  under  the  philosophy  that  a 
computerized  classification  algorithm  could  rank  order  job  options  by  their  mutual 
benefit  to  both  the  Navy  and  applicant.  The  presentation  strategy  described  above 
clearly  promotes  CLASP's  definition  of  optimality  by  reinforcing  the  classifier  to  select 
from  the  top  of  the  list. 

Enlisted  recruit  classification  occurs  in  an  environment  that  places  a  strong 
emphasis  on  filling  quotas  and  meeting  recruiting  requirements  and  objectives.  The 
classification  algorithm  must  operate  in  a  manner  consistent  with  the  attainment  of 
these  goals.  These  goals  originate  outside  of  CNRC.  Some  Navy  ratings  have  large 
recruiting  goals,  due  to  large  manpower  requirements,  while  other  ratings  have 
comparatively  small  goals.  Some  ratings  are  popular  and  comparatively  easy  to  sell, 
while  others  are  less  popular  and  more  difficult  to  sell.  Changes  in  recruiting  goals  and 
shifting  of  quotas  among  different  recruiting  cycles  occur  frequently.  Changes  in  the 
Navy's  perception  of  which  recruiting  goals  are  critical  in  nature  often  occur.  The  events 
that  precipitate  changing  recruiting  goals  and  changing  criticality  designations  may  be 
difficult  to  forecast  in  advance.  Hence,  the  dynamic  nature  of  the  operational  Navy 
environment  means  that  designations  of  which  jobs  are  considered  critical,  their  relative 
degrees of criticality, and recruiting goals can change suddenly and without warning.

CLASP is unable to fill job quotas evenly and re-channel applicants into critical jobs
without  substantial  classifier  intervention.  Although  the  purpose  of  the  Fraction  Fill 
component  is  to  fill  quotas  evenly  across  job  options,  it  has  minimal  impact  on  achieving 
this  because  it  carries  only  7  percent  of  the  weight  in  the  CLASP  optimality  composite. 
CLASP  cannot  re-channel  applicants  into  critical  jobs  because  it  cannot  differentiate 
between  jobs  on  the  basis  of  their  criticality.  Instead,  its  definition  of  optimality  focuses 
on  the  psychological  measures  of  goodness  of  fit  between  person  and  job.  The  inputs  to 
these  functions  are  those  person  and  job  characteristics  that  are  relatively  stable, 
permanent,  and  enduring  in  nature.  These  include  factors  such  as  intelligence,  job- 
specific  aptitude,  and  job  technical  complexity.  In  short,  it  is  not  possible  for  CLASP  or 
RIDE  to  adjust  for  all  factors  that  should  be  included  in  the  classification  process. 
Classification  algorithms  such  as  RIDE  and  CLASP  can  optimize  their  assignment 
recommendations  based  only  on  the  more  permanent  and  enduring  characteristics  of 
person  and  job,  in  particular,  the  psychological/psychometric  variables  they  currently 
use.  Classifiers  must  override  classification  algorithm  recommendations  if  they  desire  to 
incorporate  the  more  dynamic  and  time-critical  factors  into  the  classification  decision. 
Therefore,  classification  algorithms  and  their  optimal  list  presentation  strategies  must 
give  the  classifier  a  convenient  way  to  sell  any  job  currently  experiencing  a  critical  need, 
regardless  of  that  job's  ranking  on  the  classification  algorithm's  optimal  list.  In  CLASP, 
the  use  of  the  decision  index  to  rank-order  the  job  options  (instead  of  the  composite 
payoff) has helped ensure that all ratings are reasonably accessible to classifier and
applicant,  even  when  the  classifier  was  expected  to  sell  from  the  top  of  the  optimal  list. 

In  contrast,  RIDE  does  not  employ  the  DIM  concept.  It  rank-orders  job  options  on 
the  basis  of  RIDE  composite  utility.  This  may  be  entirely  valid.  RIDE  is  being  developed 
under  different  user  expectations  than  CLASP  was.  Unlike  CLASP,  RIDE  is  being 
implemented  on  modern  hardware.  The  DIM  concept  may  be  completely  unnecessary  in 
RIDE  if  user  expectations  and  hardware  capabilities  are  such  that  RIDE  can  provide 
adequate  accessibility  to  all  ratings,  regardless  of  optimality  value. 

Parameterization of RIDE Model

One  attractive  feature  of  CLASP  is  that  “A”  School  student  performance  data  is  not 
required  to  parameterize  the  model.  The  same  is  not  true  for  RIDE.  Student 
performance  data  for  each  RIDE  job  option  is  required  to  both  find  the  PDR  and 
estimate  the  FPPS  rates  in  the  cut  score  and  PDR  bins.  However,  in  the  Bin  Model 
Evaluation  section,  it  was  demonstrated  that  the  PDR  concept  does  not  stand  up  to 
rigorous  statistical  testing,  except  in  a  small  number  of  ratings. 

Given  the  following  problems  associated  with  the  use  of  school  performance  data  to 
parameterize  RIDE,  it  is  reasonable  to  ask  whether  the  RIDE  model  concept  should  be 
modified  to  eliminate  the  need  for  school  performance  data  from  the  parameter  update 
process.  These  include: 


•  Weak  empirical  support  for  the  current  PDR  concept 

•  Inexperience  and  uncertainty  with  respect  to  the  FPPS  criterion  and  data  sources 

•  School  performance  data  is  not  necessary  to  parameterize  RIDE,  except  to  update 
the  PDRs 

•  Collection  of  school  performance  data  for  PDR  update  purposes  would  require 
time,  money,  and  effort  far  beyond  that  required  to  collect  only  applicant  data  for 
the  same  purpose 

•  In  some  cases,  ASVAB  selector  composite  and/or  cut  score  changes  cannot  be 
implemented  immediately  in  RIDE,  due  to  the  inappropriateness  of  using  an 
outdated  validation  sample  to  update  a  PDR  parameter 

The  design  of  the  SPSU  and  AFQT  components  depends  heavily  on  the  validity  of  the 
PDR  concept.  If  one  considers  the  current  PDR  concept  to  be  invalid,  but  still  believes 
the  SPSU  and  AFQT  components  to  be  valid  mathematical  models  of  the  goodness-of-fit 
between  person  and  job,  then  an  alternative  PDR  concept  is  needed  and  an  alternative 
procedure  is  needed  to  estimate  the  PDRs.  The  alternative  procedure  must  not  depend 
on  a  hypothesized  empirical  relationship  between  FPPS  and  student  aptitude.  In 
addition,  the  procedure  should  be  constructed  so  that  parameters  can  be  estimated  from 
Navy  applicant  data  only.  School  performance  data  should  not  be  required  to  estimate 
the  parameters. 

In  the  author's  opinion,  the  RIDE  algorithm  can  be  justified  as  a  classifier  decision 
process  model  and  the  PDR  can  be  justified  as  an  important  parameter  in  that  model. 

An  estimation  procedure  satisfying  these  requirements  can  then  be  derived  from  the 
concept  of  RIDE  as  a  classifier  decision  model.  Suppose  a  classifier,  without  assistance 
from  an  automated  classification  algorithm  such  as  CLASP  or  RIDE,  must  classify  an 
applicant.  Suppose  the  applicant  satisfies  the  cut  score  in  each  option  being  considered. 
Suppose  the  classifier  knows  (a)  the  applicant's  composite  and  AFQT  scores,  (b)  cut 
scores  for  all  composites,  and  (c)  all  composite  score  distributions  relative  to  the 
applicant  population.  A  simple  mathematical  model  can  be  developed  to  classify  the 
applicant  based  on  the  given  information.  The  model  is  based  on  the  assumption  that 
the  classifier  uses  reference  points  on  the  composite  and  AFQT  score  distributions  as 
rules-of-thumb  for  determining  which  job  option  the  applicant  is  best  suited  for. 

In  this  model,  the  classifier  uses  one  such  reference  point  in  the  same  manner  as  a 
PDR,  that  is,  to  designate  a  decision  cut-off  point  which  he  may  use  to  determine 
whether  his  applicant  is  marginally  qualified,  maximally  qualified,  or  over-qualified  for 
a  given  job.  If  the  composite  score  exceeds  the  PDR,  then  the  classifier  may  consider  the 
applicant  as  either  over-qualified  for  the  job  under  consideration  (and  thus  a  potential 
candidate  for  a  more  difficult  job)  or  maximally  qualified  (and  thus  a  solid  candidate  for 
the  job  under  consideration).  If  the  composite  score  is  less  than  the  PDR,  then  the 
classifier  may  consider  the  applicant  as  marginally  qualified  for  the  job  and,  therefore,  a 
potential  candidate  for  a  less  difficult  job.  As  previously  described,  the  AFQT  component 
decides  whether  the  applicant  is  over-qualified  or  maximally  qualified.  The  decision 
currently depends upon where the applicant's AFQT score stands in relation to the
over-qualification point (M + Delta) on the AFQT distribution for that job option.

Future research is required to further analyze and fill in the details of this proposed
modification. Unanswered issues remain concerning the size of the marginally-qualified,
maximally-qualified, and over-qualified regions. Should they all be about the same size,
in  terms  of  the  proportions  of  the  applicant  population  residing  in  each  region? 
Depending  upon  the  answer,  consideration  should  be  given  to  modifying  the  AFQT 
component's  current  decision  rule. 
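
The three-region decision rule of the proposed classifier decision model can be sketched as a simple Python function. This is a hypothetical illustration only: the function and argument names are invented, the report does not say how a composite score exactly equal to the PDR is treated (it is placed in the upper branch here), and the assumption that an AFQT score above M + Delta means "over-qualified" is an inference from the text rather than a documented rule.

```python
def qualification_region(composite, afqt, pdr, m, delta):
    """Classify an applicant who already meets the cut score for a job option.

    composite : applicant's selector composite score for the job option
    afqt      : applicant's AFQT score
    pdr       : PDR reference point on the composite distribution
    m, delta  : define the AFQT over-qualification point (m + delta)
    """
    if composite < pdr:
        # Below the PDR: potential candidate for a less difficult job.
        return "marginally qualified"
    if afqt > m + delta:
        # Assumed direction: above the over-qualification point.
        return "over-qualified"
    return "maximally qualified"


# Hypothetical job option with PDR = 50 and over-qualification point 55 + 10.
print(qualification_region(45, 60, pdr=50, m=55, delta=10))  # marginally qualified
print(qualification_region(55, 60, pdr=50, m=55, delta=10))  # maximally qualified
print(qualification_region(55, 70, pdr=50, m=55, delta=10))  # over-qualified
```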

As  this  report  is  being  finalized,  it  is  uncertain  whether  “A”  School  performance  data 
will  be  available  and  whether  its  use  will  be  feasible  for  input  to  the  RIDE  parameter 
update  process.  This  report  has  also  raised  questions  concerning  the  lack  of  empirical 
support  for  the  PDR  concept  and  the  Navy’s  lack  of  experience  with  and  understanding 
of  FPPS.  If  school  performance  data  is  either  unavailable  or  infeasible  for  use,  then 
questions  regarding  the  appropriateness  of  the  Bin  model  to  estimate  FPPS  are 
irrelevant.  As  previously  discussed,  it  will  be  necessary  to  reformulate  the  RIDE  model 
in  terms  of  some  underlying  concept  other  than  PDR  as  an  indicator  of  student  over- 
qualification. 

However,  if  it  is  determined  that  school  performance  data  is  available  and  feasible 
for use (and the FPPS criterion and original PDR concept are still considered valid), then
the  appropriateness  of  the  Bin  model  for  FPPS  estimation  purposes  becomes  an 
important  issue.  In  particular,  NPRST  must  then  determine  what  FPPS  estimation 
methodologies  may  be  more  appropriate  and  more  accurate  than  the  Bin  procedure.  The 
results in Table 1 strongly suggest that the LRM is superior to the Bin procedure,
particularly  when  both  bias  and  estimation  error  are  taken  into  consideration.  In 
addition,  from  a  mathematical  and  software  implementation  standpoint,  the  LRM  is  no 
more complex than the Bin model.

Conclusions and Recommendations


If  classification  policymakers  desire  that  the  nominal  and  effective  weights  of  the 
RIDE  components  remain  approximately  equal,  RIDE  must  be  modified  so  that  each 
component  is  standardized  (by  dividing  by  its  standard  deviation)  before  calculating  the 
RIDE  composite  payoff. 

Assume that (1) classifiers using RIDE are not under any obligation to sell from the
top  of  the  optimal  list  and  (2)  unlike  CLASP,  there  are  no  constraints  on  classifier  access 
to  the  lower  portions  of  the  RIDE  optimal  list  and  it  is  equally  convenient  for  him  to  sell 
a  job  from  the  bottom  of  the  list  as  it  is  for  him  to  sell  one  from  the  top.  Then,  the  DIM 
concept  is  not  required  in  the  RIDE  model  because  a  classifier  using  RIDE  has  sufficient 
freedom  to  put  the  applicant  into  a  job  option  with  a  low  optimality  value  if  quota  fill 
and/or  criticality  requirements  dictate  that  he  do  so. 

If  “A”  School  performance  data  is  unavailable  or  is  determined  to  be  infeasible  for 
use  in  the  RIDE  parameter  update  (or  the  current  PDR  concept  is  considered  invalid), 
then  classification  policymakers  should  consider  the  classifier  decision  model  as  a 
potential  alternative  for  redefining  the  PDR  concept  and  becoming  the  conceptual 
framework  for  the  RIDE  parameter  update  process. 

If  school  performance  data  is  available  and  feasible  for  use  in  the  RIDE  parameter 
update,  then  classification  policymakers  should  consider  the  LRM  as  a  replacement  for 
the Bin methodology.

Appendix A:

Derivation of EATE Formula


Derivation of EATE Formula and Pseudo Code to Compare the Bin and LRM Models


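Total Error $\varepsilon \sim N(\mu, \sigma^{2})$, $\mu$ = bias, $\sigma^{2}$ = estimation error variance.
Define $\varphi(t)$ = standard normal prob. density function,

Define $\Phi(t)$ = standard normal cumulative distribution function.

$$EATE = E|\varepsilon| = \int_{-\infty}^{\infty} |t| \, \frac{1}{\sigma} \varphi\!\left(\frac{t-\mu}{\sigma}\right) dt$$

$$EATE = \int_{-\infty}^{0} -t \, \frac{1}{\sigma} \varphi\!\left(\frac{t-\mu}{\sigma}\right) dt + \int_{0}^{\infty} t \, \frac{1}{\sigma} \varphi\!\left(\frac{t-\mu}{\sigma}\right) dt$$

Substitute $y = \frac{t-\mu}{\sigma}$ and $dy = \frac{1}{\sigma}\,dt$ with $t = \sigma y + \mu$:

$$EATE = \int_{-\infty}^{-\mu/\sigma} -(\sigma y + \mu)\varphi(y)\,dy + \int_{-\mu/\sigma}^{\infty} (\sigma y + \mu)\varphi(y)\,dy$$

$$EATE = \Big[ \sigma\varphi(y) - \mu\Phi(y) \Big]_{-\infty}^{-\mu/\sigma} + \Big[ -\sigma\varphi(y) + \mu\Phi(y) \Big]_{-\mu/\sigma}^{\infty}$$

$$EATE = 2\sigma\varphi\!\left(\frac{\mu}{\sigma}\right) - 2\mu\Phi\!\left(-\frac{\mu}{\sigma}\right) + \mu$$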
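
The closed-form EATE expression can be checked numerically with a few lines of Python. This is a minimal sketch using only the standard library; the function names are illustrative, and the standard normal CDF is written in terms of `math.erfc`.

```python
import math


def phi(t):
    """Standard normal probability density function."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def Phi(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def eate(mu, sigma):
    """E|e| for e ~ N(mu, sigma^2): 2*sigma*phi(mu/sigma) - 2*mu*Phi(-mu/sigma) + mu."""
    z = mu / sigma
    return 2.0 * sigma * phi(z) - 2.0 * mu * Phi(-z) + mu


# With zero bias the formula reduces to sigma * sqrt(2 / pi).
print(eate(0.0, 1.0), math.sqrt(2.0 / math.pi))

# The result depends only on the magnitude of the bias, not its sign.
print(eate(1.5, 0.7), eate(-1.5, 0.7))
```

Two sanity checks follow directly from the formula: with no bias the EATE is proportional to the estimation error standard deviation, and when the bias is large relative to sigma the EATE approaches the absolute bias.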


Pseudo Code to compare the Bin and LRM models:

Outer Loop over j = 1, ..., 70 RIDE job options:


Inner Loop over composite score $X_{i,j}$, where $CS \le X_{i,j} \le CMax$.

LRM Calculation: Calculate LRM EATE for current value of $X_{i,j}$.

Assume Bias in Logit = 0, and Variance of logit is computed as follows:

The estimated logit $L(X_{i,j})$ is

For LLRM: $L(X_{i,j}) = \sum_{k=0}^{1} \beta_{k,j} X_{i,j}^{k}$. For QLRM: $L(X_{i,j}) = \sum_{k=0}^{2} \alpha_{k,j} X_{i,j}^{k}$.


The variance of the estimated logit is given by the quadratic form $x^{t} \Sigma x$, where
$\Sigma$ is the estimated covariance matrix of the parameter estimates,
$x = (1, X_{i,j})^{t}$ for the LLRM, $x = (1, X_{i,j}, X_{i,j}^{2})^{t}$ for the QLRM, and the
superscript $t$ indicates matrix transposition. Then, calculate EATE in Logit by substituting
the logit bias = 0 and the logit variance into equation EATE to obtain the EATE of the logit
($EATE(L(X_{i,j}))$ in LRM EATE formula):


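Calculate EATE of LRM FPPS rate estimate (LRM EATE) for $X_{i,j}$ by

[Formula illegible in the source: two terms of the form $1 / (1 + \exp[\cdot])$ whose exponent
arguments cannot be recovered.]

For current job option j, accumulate sum of LRM EATE over all $X_{i,j}$.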


Bin Calculation: Calculate EATE(Bin) = EATE of Bin model FPPS rate estimator:
For current X, compute Bin bias = difference between Stage I and Stage II
Models at the current value of $X_{i,j}$. The stage I estimator is:

$$\frac{F_j(m) - F_j(m-1)}{b_j(m) - b_j(m-1)} \left( X_{i,j} - b_j(m-1) \right) + F_j(m-1)$$

where $F_j(m)$ is the FPPS rate in the $m$th bin of school sample $j$, and $X_{i,j}$ is located
between the midpoints of the $(m-1)$th and $m$th bins, i.e., $b_j(m-1) \le X_{i,j} \le b_j(m)$.

For current X, calculate stage I model estimation error variance for the current $X_{i,j}$ satisfying
$b_j(m-1) \le X_{i,j} \le b_j(m)$. The variance is

$$\left[ \frac{X_{i,j} - b_j(m-1)}{b_j(m) - b_j(m-1)} \right]^{2} Var\big(F_j(m)\big) + \left[ \frac{b_j(m) - X_{i,j}}{b_j(m) - b_j(m-1)} \right]^{2} Var\big(F_j(m-1)\big)$$

where $Var\big(F_j(m)\big) = \dfrac{F_j(m)\,\big(1 - F_j(m)\big)}{N_j(m)}$.

The variance is derived by computing the variance of the stage I estimate, and using the
properties of the binomial distribution and the independence of the FPPS rates in the $m$th and
$(m-1)$th bins.

For current X, calculate EATE(Bin) using the Bin bias and the stage I estimation error
variance and substituting them into equation EATE.

Accumulate sum of EATE(Bin) over all $X_{i,j}$.

End Inner Loop ($CS \le X_{i,j} \le CMax$ loop).

For job option j, compute LRM EATE = average of EATE(LRM) = mean EATE
of LRM FPPS rate estimates over all $X_{i,j}$: $\{ X_{i,j} \mid CS \le X_{i,j} \le CMax \}$.


For job option j, compute Bin EATE = average of EATE(Bin) = mean EATE
of Bin FPPS rate estimates over all $X_{i,j}$: $\{ X_{i,j} \mid CS \le X_{i,j} \le CMax \}$.

End Outer Loop (RIDE job option j loop).

Compute Overall LRM EATE = average of Expected Absolute Total Error of LRM FPPS rate
estimates over all job options j and all $X_{i,j}$.

Compute Overall Bin EATE = average of Expected Absolute Total Error of Bin FPPS rate
estimates over all job options j and all $X_{i,j}$.
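
The Bin branch of the pseudo code, together with the logit variance quadratic form, can be sketched in Python as follows. This is a hedged illustration rather than the code used for the report: all function names, bin midpoints, FPPS rates, sample sizes, and the Stage II value are hypothetical, the squared interpolation weights follow from the independence of the two bin rates, and $N_j(m)$ is assumed to be the number of students in bin m. The LRM EATE step is omitted because that formula is not legible in the source.

```python
import math


def eate(mu, sigma):
    """E|e| for e ~ N(mu, sigma^2), from the closed form in this appendix."""
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf_neg = 0.5 * math.erfc(z / math.sqrt(2.0))  # Phi(-z)
    return 2.0 * sigma * pdf - 2.0 * mu * cdf_neg + mu


def bin_stage1(x, b_lo, b_hi, f_lo, f_hi):
    """Linear interpolation of the FPPS rate between adjacent bin midpoints."""
    w = (x - b_lo) / (b_hi - b_lo)
    return f_lo + w * (f_hi - f_lo)


def bin_stage1_variance(x, b_lo, b_hi, f_lo, f_hi, n_lo, n_hi):
    """Variance of the interpolated rate, using binomial variances per bin."""
    w = (x - b_lo) / (b_hi - b_lo)
    var_hi = f_hi * (1.0 - f_hi) / n_hi
    var_lo = f_lo * (1.0 - f_lo) / n_lo
    return w * w * var_hi + (1.0 - w) ** 2 * var_lo


def logit_variance(x_vec, cov):
    """Quadratic form x' * Sigma * x for the variance of the estimated logit."""
    n = len(x_vec)
    return sum(x_vec[r] * cov[r][c] * x_vec[c] for r in range(n) for c in range(n))


# Hypothetical bins: midpoints 50 and 60, FPPS rates 0.80 and 0.90, 100 students each.
x = 55.0
stage1 = bin_stage1(x, 50.0, 60.0, 0.80, 0.90)
variance = bin_stage1_variance(x, 50.0, 60.0, 0.80, 0.90, 100, 100)

stage2 = 0.84  # hypothetical Stage II model value at x
bin_bias = stage1 - stage2
print(stage1, variance, eate(bin_bias, math.sqrt(variance)))
```

Averaging `eate(...)` over every composite score between the cut score and the maximum, and then over job options, would give the Bin EATE figures described in the pseudo code.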
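
Finding the QLRM Extreme Value Point


Show that QLRM has exactly one extreme value point and find it. Show that the extreme value
point is a minimum if $\alpha_{2,j} > 0$ and is a maximum if $\alpha_{2,j} < 0$: The QLRM is given by:

$$S^{Q}_{i,j} = \left\{ 1 + \exp\left[ -\left( \alpha_{2,j} X_{i,j}^{2} + \alpha_{1,j} X_{i,j} + \alpha_{0,j} \right) \right] \right\}^{-1}$$

Extreme value point is found by solving $\frac{dS^{Q}_{i,j}}{dX_{i,j}} = 0$ for $X_{i,j}$ (Rodin, 1970). By the chain rule,

$$\frac{dS^{Q}_{i,j}}{dX_{i,j}} = \frac{dL}{dy} \, \frac{dy}{dX_{i,j}}$$

where $L(y) = \{ 1 + \exp[-y] \}^{-1}$ and $y(X_{i,j}) = \alpha_{2,j} X_{i,j}^{2} + \alpha_{1,j} X_{i,j} + \alpha_{0,j}$, so that

$$\frac{dL}{dy} = \frac{\exp(-y)}{\left( 1 + \exp(-y) \right)^{2}} > 0 \text{ for all } y \quad \text{and} \quad \frac{dy}{dX_{i,j}} = 2\alpha_{2,j} X_{i,j} + \alpha_{1,j}.$$

Hence, $\frac{dS^{Q}_{i,j}}{dX_{i,j}} = 0$ if and only if $2\alpha_{2,j} X_{i,j} + \alpha_{1,j} = 0$. Therefore, solving this equation for $X_{i,j}$,
we see that the only extreme value point occurs at

$$X_{Ext} = \frac{-\alpha_{1,j}}{2\alpha_{2,j}}.$$

Extreme value point $X_{Ext}$ is a minimum if and only if $\frac{d^{2}S^{Q}_{i,j}}{dX_{i,j}^{2}} > 0$ and is a maximum if and only if
$\frac{d^{2}S^{Q}_{i,j}}{dX_{i,j}^{2}} < 0$, where $\frac{d^{2}S^{Q}_{i,j}}{dX_{i,j}^{2}}$ is evaluated at $X_{Ext}$ (Rodin, 1970).

$$\frac{d^{2}S^{Q}_{i,j}}{dX_{i,j}^{2}} = \frac{d^{2}L}{dy^{2}} \left( \frac{dy}{dX_{i,j}} \right)^{2} + \frac{dL}{dy} \, \frac{d^{2}y}{dX_{i,j}^{2}}.$$

However, $\frac{dy}{dX_{i,j}} = 0$ at $X_{i,j} = X_{Ext}$, so

$$\frac{d^{2}S^{Q}_{i,j}}{dX_{i,j}^{2}} = \frac{dL}{dy} \, \frac{d^{2}y}{dX_{i,j}^{2}} = 2\alpha_{2,j} \frac{dL}{dy}.$$

Since $\frac{dL}{dy} > 0$, $\frac{d^{2}S^{Q}_{i,j}}{dX_{i,j}^{2}} < 0$ if and only if $\alpha_{2,j} < 0$ and
$\frac{d^{2}S^{Q}_{i,j}}{dX_{i,j}^{2}} > 0$ if and only if $\alpha_{2,j} > 0$. QED.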
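
The result can be checked numerically with a brief Python sketch. The coefficient values below are hypothetical and chosen only so that the extreme value point falls at a round number; the function names are likewise illustrative.

```python
import math


def qlrm(x, a0, a1, a2):
    """Quadratic logistic model: 1 / (1 + exp(-(a2*x^2 + a1*x + a0)))."""
    y = a2 * x * x + a1 * x + a0
    return 1.0 / (1.0 + math.exp(-y))


def extreme_point(a1, a2):
    """Location of the single extreme value point, -a1 / (2 * a2)."""
    return -a1 / (2.0 * a2)


# Hypothetical coefficients with a2 < 0, so the extreme point is a maximum.
a0, a1, a2 = -4.0, 0.2, -0.002
x_ext = extreme_point(a1, a2)  # approximately 50
print(x_ext, qlrm(x_ext, a0, a1, a2))

# Points on either side of x_ext give a lower predicted rate.
print(qlrm(x_ext - 1.0, a0, a1, a2), qlrm(x_ext + 1.0, a0, a1, a2))
```

Reversing the signs of all three coefficients turns the same point into a minimum, in line with the sign condition on the quadratic coefficient.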

Use of P-value


In our testing situation, we set a null hypothesis for the QLRM and a null hypothesis
for the LLRM. The QLRM null hypothesis is that the $X^{2}$ coefficient in the QLRM is
zero, while the LLRM null hypothesis states that the $X$ coefficient in the LLRM is zero.
Opposing each null hypothesis is the corresponding alternative hypothesis stating the
coefficient is non-zero. For two-sided tests such as these, the p-value may be defined as
the probability, computed under the null hypothesis, that the test statistic is at least as
large in absolute value as the value actually observed. Stated differently, the p-value is
the smallest significance level at which the experimenter would reject the null
hypothesis on the basis of his observed parameter estimate. Thus, a small p-value
implies small credibility for the null hypothesis and a large p-value implies large
credibility for the null hypothesis. Hence, the p-value associated with the $X^{2}$ coefficient
estimate in the QLRM is a convenient way to measure the credibility of the QLRM null
hypothesis, while the p-value associated with the $X$ coefficient estimate in the LLRM is
a convenient way to measure the credibility of the LLRM null hypothesis (Wonnacott &
Wonnacott, 1972).

In logistic regression analysis, the test statistic is the square of the ratio of the
parameter estimate to its standard error. In our case, the LLRM null hypothesis states
that the slope parameter in the LLRM is zero and, therefore, that composite score is not
useful in predicting FPPS. If the LLRM null hypothesis is rejected in favor of the LLRM
alternative hypothesis $H_{A,LLRM}: a_{L,1} \neq 0$, then composite score results in a
statistically significant improvement in predicting FPPS. Under the null hypothesis, the
square of the ratio of the estimated slope parameter to its standard error has,
approximately, a chi-square distribution with 1 degree of freedom. An analogous argument
shows that if the QLRM null hypothesis is rejected in favor of the QLRM alternative
hypothesis $H_{A,QLRM}: a_{Q,2} \neq 0$, then we may conclude there is statistical
evidence that an extreme value point (i.e., maximum or minimum) exists.
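
As a minimal sketch of the test just described, the following Python computes the
chi-square statistic and p-value for a single coefficient and, when the quadratic
coefficient is significant, reports the implied extreme value point and its type. The
function names, the 0.05 significance level, and the numeric estimates are illustrative
assumptions, not values taken from this report.

```python
import math


def wald_p_value(estimate, std_error):
    """Return (chi-square statistic, p-value) for H0: coefficient = 0.

    The statistic is (estimate / std_error)**2, referred to a chi-square
    distribution with 1 degree of freedom. For 1 df the upper-tail
    probability equals erfc(sqrt(stat / 2)).
    """
    stat = (estimate / std_error) ** 2
    return stat, math.erfc(math.sqrt(stat / 2.0))


def extreme_point(a2, a1):
    """Locate and classify the extreme point of y = a2*x**2 + a1*x + a0."""
    x_ext = -a1 / (2.0 * a2)
    kind = "maximum" if a2 < 0 else "minimum"
    return x_ext, kind


# Hypothetical QLRM estimates (illustrative only).
a2_hat, a2_se = -0.004, 0.001
a1_hat = 0.48

stat, p = wald_p_value(a2_hat, a2_se)
if p < 0.05:  # assumed significance level
    x_ext, kind = extreme_point(a2_hat, a1_hat)
    print(f"chi-square = {stat:.2f}, p = {p:.2g}: {kind} at X = {x_ext:.1f}")
else:
    print(f"chi-square = {stat:.2f}, p = {p:.2g}: no evidence of an extreme point")
```

The closed-form tail probability avoids a dependency on a statistics package; a library
chi-square survival function would serve equally well.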

Features of Aptitude/Difficulty Utility Function


In Figure 3, $U_{A/D}(A,D)$ is plotted on the vertical axis against Job Difficulty $D$ on the
horizontal axis for fixed Aptitude values A = 40, 50, 60, 80, 90, and 99. Equation
(Apt_Dif) indicates that this function is both a quadratic in $A$ (for fixed $D$) and a
quadratic in $D$ (for fixed $A$). Each curve in Figure 3 represents $U_{A/D}(A,D)$ for a fixed $A$.
Since each curve is a quadratic in $D$, it has exactly one maximum on the job difficulty
interval between D=40 and D=99. The maximum, hereafter called $D_{Max}(A)$, occurs at
the difficulty level $D$ that awards the largest utility score for the applicant whose
aptitude is $A$. $D_{Max}(A)$ may be obtained by setting the partial derivative of $U_{A/D}(A,D)$
with respect to $D$ equal to zero, then solving for $D$:

$$\frac{\partial U_{A/D}}{\partial D} = B_{0,1} + 2B_{2,2}(A-100)^{2}(D-35) + B_{2,1}(A-100)^{2} + 2B_{0,2}(D-35) = 0,$$

$$D_{Max}(A) = 35 - \frac{B_{0,1} + B_{2,1}(A-100)^{2}}{2B_{2,2}(A-100)^{2} + 2B_{0,2}}.$$

Table D-1 shows the $D_{Max}(A)$ value associated with each Aptitude score between 40
and 99, inclusive. One can verify that $D_{Max}(A)$ maximizes $U_{A/D}(A,D)$ over all $D$ by
observing that the second partial derivative of $U_{A/D}$ with respect to $D$ is less than zero
for $40 \le A \le 99$:

$$\frac{\partial^{2} U_{A/D}}{\partial D^{2}} = 2B_{2,2}(A-100)^{2} + 2B_{0,2} < 0.$$
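
To make the derivation concrete, the sketch below evaluates the closed-form maximizer
and checks numerically that the first-order condition holds there. It assumes the utility
has the form U = B00 + B01(D-35) + B02(D-35)^2 + (A-100)^2 [B20 + B21(D-35) + B22(D-35)^2],
which is inferred from the partial derivatives in this appendix rather than quoted from
Equation (Apt_Dif). The coefficient values are hypothetical placeholders, not the fitted
values used in the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coeffs:
    """Hypothetical coefficients; NOT the report's fitted values."""
    b00: float = 0.0
    b01: float = 2.0
    b02: float = -0.02
    b20: float = -0.01
    b21: float = -1e-4
    b22: float = -1e-5


def utility(a, d, c):
    """Assumed quadratic-in-A, quadratic-in-D utility surface."""
    s, t = (a - 100.0) ** 2, d - 35.0
    return (c.b00 + c.b01 * t + c.b02 * t * t
            + s * (c.b20 + c.b21 * t + c.b22 * t * t))


def du_dd(a, d, c):
    """First partial derivative of the utility with respect to D."""
    s, t = (a - 100.0) ** 2, d - 35.0
    return c.b01 + 2 * c.b22 * s * t + c.b21 * s + 2 * c.b02 * t


def d2u_dd2(a, c):
    """Second partial derivative with respect to D (constant in D)."""
    return 2 * c.b22 * (a - 100.0) ** 2 + 2 * c.b02


def d_max(a, c):
    """Difficulty at which du_dd vanishes for aptitude a."""
    s = (a - 100.0) ** 2
    return 35.0 - (c.b01 + c.b21 * s) / (2 * c.b22 * s + 2 * c.b02)


c = Coeffs()
for a in (40, 60, 80, 99):
    d = d_max(a, c)
    # Columns: aptitude, maximizer, derivative at maximizer, concavity check.
    print(a, round(d, 2), round(du_dd(a, d, c), 12), d2u_dd2(a, c) < 0)
```

With the report's actual coefficients substituted for the placeholders, the same
functions should reproduce the maximizer column of Table D-1.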

As demonstrated in Table D-1 and mathematically below, both $D_{Max}(A)$ and
$U_{A/D}(A, D_{Max}(A))$ are monotonically increasing in $A$.

Differentiating $U_{A/D}$ with respect to $A$, we obtain

$$\frac{\partial U_{A/D}}{\partial A} = 2(A-100)\left[B_{2,0} + B_{2,1}(D-35) + B_{2,2}(D-35)^{2}\right].$$

Since $A - 100 < 0$, $B_{2,0} < 0$, $B_{2,1} < 0$, $B_{2,2} < 0$, and $D - 35 > 0$, the bracketed
term is negative, so $\partial U_{A/D}/\partial A > 0$ for all $A$ between 40 and 99, inclusive,
and for all $D$ between 40 and 99, inclusive. Thus, for any given $D$,
$U_{A/D}(A_1,D) > U_{A/D}(A_2,D)$ if $A_1 > A_2$. In particular, this is true for $D = D_{Max}(A)$.
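
The sign argument can also be checked numerically. The sketch below evaluates the
partial derivative with respect to aptitude over the integer grid of A and D from 40 to
99, using hypothetical negative coefficients (placeholders, not the report's fitted
values), and reports the smallest value found.

```python
def du_da(a, d, b20, b21, b22):
    """Partial derivative of the utility with respect to aptitude A."""
    t = d - 35.0
    return 2.0 * (a - 100.0) * (b20 + b21 * t + b22 * t * t)


# Hypothetical coefficients, all negative as the argument requires.
B20, B21, B22 = -0.01, -1e-4, -1e-5

smallest = min(
    du_da(a, d, B20, B21, B22)
    for a in range(40, 100)
    for d in range(40, 100)
)
print(f"smallest dU/dA on the grid: {smallest:.6f}")
```

A positive minimum over the grid is consistent with utility increasing in aptitude at
every difficulty level considered.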

Table D-1

DMax(A) value associated with each Aptitude score between 40 and 99

Apt   DMax(A)   U(A,DMax(A))      Apt   DMax(A)   U(A,DMax(A))
40    43.1      33.1              70    65.7      55.6
41    43.5      33.4              71    67.0      57.0
42    43.9      33.8              72    68.4      58.4
43    44.3      34.3              73    69.9      59.8
44    44.8      34.7              74    71.4      61.4
45    45.2      35.1              75    73.0      62.9
46    45.7      35.6              76    74.6      64.6
47    46.2      36.1              77    76.3      66.2
48    46.7      36.6              78    78.0      68.0
49    47.2      37.1              79    79.8      69.8
50    47.8      37.7              80    81.6      71.6
51    48.3      38.3              81    83.5      73.4
52    48.9      38.9              82    85.4      75.3
53    49.6      39.5              83    87.3      77.2
54    50.2      40.2              84    89.2      79.2
55    50.9      40.8              85    91.1      81.1
56    51.6      41.5              86    93.1      83.0
57    52.4      42.3              87    95.0      84.9
58    53.2      43.1              88    96.8      86.8
59    54.0      43.9              89    98.6      88.6
60    54.8      44.7              90    100.4     90.4
61    55.7      45.6              91    102.0     92.0
62    56.6      46.6              92    103.6     93.6
63    57.6      47.5              93    105.0     95.0
64    58.6      48.5              94    106.3     96.3
65    59.7      49.6              95    107.4     97.4
66    60.8      50.7              96    108.3     98.3
67    61.9      51.8              97    109.1     99.1
68    63.1      53.0              98    109.6     99.6
69    64.4      54.3              99    109.9     99.9

Disentangling the SPSU and AFQT Components


The Disentangle program looks at four alternative $\theta_j$ (theta) definitions and five
alternative applicants. $\theta_j$ is where AFQT utility reaches its minimum (zero) after
declining from its maximum. The alternative $\theta_j$ values are given by $M_j + N\sigma_j$,
where N = 1.0, 2.0, 3.0, and 3.5. $M_j + \delta_j$ is where AFQT utility begins to decline from
its maximum value (100) toward its minimum value of zero. The Disentangle program always
defines $M_j + \delta_j = M_j + \frac{1}{2}\sigma_j$.

In this example, we look at one applicant whose AFQT score is 90. N has been fixed
at 3.5. Thus, $\theta_j = M_j + 3.5\sigma_j$. The applicant qualified for all jobs (X > CutS). On
all jobs for which composite score X exceeded the PDR, $S_{i,j}$ was equal to 100 * Hardness
factor (HF). Hence, the applicant achieves the highest possible $S_{i,j}$ score on all these jobs.

The applicant's AFQT (90) was greater than $M_j + \delta_j$ for all jobs except NF and MM-NF.
Hence, the applicant achieves the highest possible $Q_{i,j} = 100$ only on NF and MM-NF. The
applicant's AFQT (90) was greater than $\theta_j$ only for SKS-SG and SM-SG. Hence, the
applicant received the lowest possible $Q_{i,j}$ (zero) on these 2 jobs. For all jobs other than
MM-NF, NF, SKS-SG, and SM-SG, his AFQT was greater than $M_j + \delta_j$ and less than $\theta_j$.
Therefore, the applicant's $Q_{i,j}$ on these jobs was between zero (the minimum) and 100
(the maximum).
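
The three cases in this example can be sketched as a piecewise function. The text states
only that the AFQT utility lies between zero and 100 in the declining region, so the
linear decline used here is an assumption, as are the function name and the job
parameters, which are hypothetical and not the report's values.

```python
def afqt_utility(afqt, m_j, sigma_j, n=3.5):
    """Piecewise AFQT utility Q for one applicant/job pair.

    Q is 100 up to M_j + 0.5*sigma_j and 0 from theta_j = M_j + n*sigma_j.
    ASSUMPTION: the decline in between is linear; the text only says Q
    lies between 0 and 100 there.
    """
    start = m_j + 0.5 * sigma_j      # M_j + delta_j: decline begins
    theta = m_j + n * sigma_j        # theta_j: utility reaches zero
    if afqt <= start:
        return 100.0
    if afqt >= theta:
        return 0.0
    return 100.0 * (theta - afqt) / (theta - start)


# Hypothetical (M_j, sigma_j) pairs; the report's job parameters are not
# reproduced here.
jobs = {
    "high-mean job": (95.0, 5.0),
    "mid-mean job": (80.0, 8.0),
    "low-mean job": (60.0, 5.0),
}

for name, (m_j, sigma_j) in jobs.items():
    print(name, afqt_utility(90, m_j, sigma_j))
```

An applicant with AFQT 90 keeps full utility on the high-mean job, falls to zero on the
low-mean job, and lands in between on the mid-mean job, mirroring the three groups of
jobs described above.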